Abstract

Illness Management and Recovery (IMR) was implemented and assessed for fidelity in four state psychiatric hospitals over a 6-year period. Differences in the assessment of the structural and clinical elements of the practice were evaluated. The scores for the structural aspects of the program began and remained in the "fully implemented" range throughout the 6 years of observation. The scores for the clinical elements of the program began in the "not implemented" range and reached only "partially implemented" across the 6 years. Three recommendations to improve clinical fidelity scores are (a) composing IMR groups with attention to the homogeneity of consumers' current functioning, (b) identifying IMR facilitators who are motivated to provide the intervention to fidelity, and (c) using an audit and feedback approach (Ivers et al., 2014) with the IMR Treatment Integrity Scale (McGuire et al., 2015) to shape high-fidelity practice. The implications of the study's findings, study limitations, and areas for future research are discussed.

Keywords

Illness Management and Recovery, inpatient setting, fidelity, evidence-based practices

Introduction

Illness Management and Recovery (IMR) is a standardized illness management program for individuals with schizophrenia and other severe mental illnesses, which is available as a "toolkit" for ease of implementation (Mueser et al., 2006). The IMR program combines five empirically supported therapeutic strategies: psycho-education, behavioral tailoring, relapse prevention training, social skills training, and coping skills training (Mueser & Gingerich, 2002). The interventions were then incorporated into an IMR program manual that was organized into 11 topical modules. Each module uses a combination of educational, motivational, and cognitive-behavioral teaching strategies (Mueser et al., 2006). The proximal goal of IMR is teaching clients about and supporting the practice of illness management fundamentals, while the distal goal is helping clients make progress toward personal recovery or life goal(s) (Mueser et al., 2006).

In a recent review of IMR outcomes in community programs, McGuire and colleagues (2014) reported on six quasi-experimental studies and three randomized controlled trials (RCTs). All studies found generally positive outcomes, with all three RCTs finding significant improvements in staff and client ratings on a measure of recovery, and two of the RCTs finding reductions in ratings of psychiatric symptoms (McGuire et al., 2014).

State hospitals have been criticized for employing a custodial and/or medical model of care in which professionals assess, diagnose, and treat patients, often with little input from the person being treated (Smith & Bartholomew, 2006). The use of IMR in state hospitals has been proposed as a way of increasing consumers' involvement in their own recovery (Bartholomew & Kensler, 2010). Implementation issues, including measuring fidelity to evidence-based practices (EBPs) in state hospital settings, have not received sufficient attention in the literature. Snyder, Clark, and Jones (2012) raise concerns that evidence-based practices studied in the community may lack external validity in institutional settings because of the higher levels of illness severity, high heterogeneity of patient functioning, and short durations of care found in some institutions. Few studies of IMR in institutional settings have been reported. One quasi-experimental study conducted in a state psychiatric hospital found that, after controlling for other possible explanations, each hour of IMR participation was associated with a 1.1% reduced risk of readmission after discharge (Bartholomew & Zechner, 2014).

Evidence-based practices like IMR are not routinely available to most people with severe mental illness (Bighelli et al., 2016; Briand & Menear, 2014; Lehman et al., 2004; Torrey, Lynde, & Gorman, 2005). Research suggests that "training only" initiatives rarely result in a substantive change in the clinical behavior of those trained (Briand & Menear, 2014; Torrey, Finnerty, Evans, & Wyzik, 2003) and that passive dissemination of evidence through journals, conferences, and trainings is ineffective as a means of implementation (Briand & Menear, 2014). Organizations may, in error, focus on tracking outputs such as the number of staff trained or the number of groups provided. What is needed is a complementary focus on the extent to which clinicians use the critical ingredients of the practice and on whether the intervention produced the desired outcomes (Bighelli et al., 2016; Egeland et al., 2017; Kirkpatrick, 1979). The adoption of an EBP requires that the clinical practice be provided in a way that is consistent with the evidence of its effectiveness. Bond, Evans, Salyers, Williams, and Kim (2000) defined the concept of fidelity measurement as

the degree to which a particular program follows a program model. Fidelity measures are tools to assess the adequacy of implementation of program models. A program model is a clearly defined set of interventions and procedures to help individuals achieve some desired goal.

(p. 75)

IMR Fidelity Scale

The IMR Fidelity Scale was based on a review of outcome research performed on the components of the program (Torrey et al., 2005). The 13-item IMR Fidelity Scale is used to assess how closely an IMR program is aligned with the core IMR principles and practices. The scale uses observational criteria to rate each item on a 5-point scale ranging from 1 (not implemented) to 5 (fully implemented). McHugo and colleagues (2007) defined high fidelity as a total score of 4 or greater out of 5, with scores between 3 and 4 indicating moderate fidelity, and scores less than 3 indicating low fidelity. The scoring reflects the current structure and activities of the program and not future plans. Item ratings are then included in a report with specific comments and recommendations to improve fidelity.
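
To make these cutoffs concrete, the following minimal Python sketch (a hypothetical helper, not part of the IMR toolkit) classifies a program's mean item rating using the thresholds above:

```python
def fidelity_level(item_scores):
    """Classify overall fidelity from the 13 IMR Fidelity Scale item
    ratings, using the McHugo et al. (2007) cutoffs: a mean of 4 or
    greater is high fidelity, 3 to just under 4 is moderate fidelity,
    and below 3 is low fidelity.
    """
    mean_score = sum(item_scores) / len(item_scores)
    if mean_score >= 4:
        return "high"
    if mean_score >= 3:
        return "moderate"
    return "low"

# A program rated 4 on most items and 3 on a few averages 3.77 -> "moderate"
print(fidelity_level([4, 4, 5, 4, 3, 3, 4, 3, 4, 4, 3, 4, 4]))
```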

The IMR Fidelity Scale concentrates on two types of fidelity items. The first four items measure the implementation of the structural elements of IMR, including group size, length, comprehensiveness, and use of handouts (Bond et al., 2007). The remaining nine items assess the implementation of the clinical elements of IMR, such as the use of motivational, cognitive-behavioral, and educational teaching strategies (see Table 1).

Table 1. IMR Fidelity Scale Items by Year

IMR was studied as part of an eight-state implementation study of five EBPs (McHugo et al., 2007) and was described as one of the two most difficult practices to implement. This difficulty was thought to be related to a greater focus on clinical behavior, as fidelity to the IMR model relies more on adherence to clinical interventions (9 items) than on structural aspects (4 items). McHugo and colleagues (2007) state that "structural changes can often be made quickly whereas clinical skills require extensive training and supervision to implement fully" (p. 1283). They suggest that this difference explains why the clinical elements of IMR took longer to reach full implementation than the structural elements. Clinical adherence to the IMR model requires that facilitators of the program know the clinical interventions, can use them competently, and do so during IMR sessions. Clinical interventions are also more difficult to measure reliably than structural elements: effectively assessing clinical interventions requires ongoing measurement of the clinical process (Mowbray, Holter, Teague, & Bybee, 2003). Despite concerns about the measurement of the clinical elements of IMR with the IMR Fidelity Scale, scores on this scale were associated, in at least one study, with improved clinical outcomes for participants (Hasson-Ohayon et al., 2007).

This study was conducted to assess the challenges to IMR implementation and sustainability in a state hospital system. Specifically, the current analysis examines the structural and clinical elements of IMR, measured using the IMR Fidelity Scale, across 6 years and four state psychiatric institutions.

Method

This study is a multisite, multiyear, retrospective analysis of IMR fidelity in four state psychiatric institutions in the northeastern United States: three 450-bed general adult hospitals and one 200-bed forensic hospital.

Procedures

IMR in the four institutions was assessed for fidelity by embedded university consultants assigned to the institutions as part of a multiyear academic affiliation. Each hospital-specific faculty member provided IMR training, consultation, group supervision, and annual fidelity reviews, though faculty did not conduct fidelity reviews at their own assigned hospitals. Interrater reliability was calculated for 3 of the 6 years.

Across the 6 years and four institutions, IMR was provided exclusively in groups by staff from several clinical disciplines, including psychiatry, psychology, nursing, social work, and rehabilitation. Reviews were conducted on 46 of the 173 ongoing hospital IMR groups (27%) over the 6 years from 2009 through 2014. The groups reviewed were randomly selected from a convenience sample of groups available on the day of the review. Penetration of the IMR program into the hospitals varied from a low of 4% in the forensic site to a high of 43% in one of the general adult sites. Program penetration was defined as IMR capacity (the number of groups, assuming a maximum of eight patients per group) divided by the number of patients being treated in each hospital, multiplied by 100.
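
Read as a formula, this definition is straightforward to compute. The sketch below is a hypothetical illustration (the per-hospital group counts are not reported here); a single group running at capacity in the 200-bed forensic site would yield the 4% low end reported above:

```python
def program_penetration(num_groups, census, max_group_size=8):
    """Percentage of a hospital's patient census that could be served
    by IMR, assuming every group runs at the maximum of eight patients."""
    return num_groups * max_group_size / census * 100

# One group at capacity in a 200-bed hospital -> 4.0, matching the
# low end reported for the forensic site.
print(program_penetration(num_groups=1, census=200))
```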

Each fidelity review followed the process recommended by the Substance Abuse and Mental Health Services Administration (SAMHSA, 2009) and consisted of (a) IMR program coordinator, group facilitator(s), and patient interviews; (b) group observations; and (c) patient chart reviews.

Groups that were cofacilitated received scores averaged across the facilitator pair. Twenty percent of the IMR groups running at each hospital were selected and assessed for fidelity annually. An aggregate fidelity score was then calculated for each institution and included in a report along with a narrative description of the program. This report was shared with each respective hospital and then compiled into a statewide report that included the results from all four hospitals and was presented to the state mental health authority funding the project.
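
As a concrete sketch of this bookkeeping (all scores below are hypothetical), the aggregation amounts to averaging item ratings across a cofacilitator pair and then averaging group-level means within each institution:

```python
def group_item_scores(facilitator_ratings):
    """Average each fidelity item across a cofacilitator pair; a list
    containing a single facilitator's ratings passes through unchanged."""
    n = len(facilitator_ratings)
    return [sum(item) / n for item in zip(*facilitator_ratings)]

def institution_score(group_scores):
    """Aggregate fidelity for an institution: the mean of its groups'
    mean item scores."""
    group_means = [sum(g) / len(g) for g in group_scores]
    return sum(group_means) / len(group_means)

# Two raters' ratings on three items for one cofacilitated group
pair = [[4, 3, 5], [4, 5, 3]]
print(group_item_scores(pair))               # -> [4.0, 4.0, 4.0]
print(institution_score([[4.0, 4.0, 4.0],    # aggregate across two groups
                         [3.0, 3.5, 4.0]]))  # -> 3.75
```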

After each annual review, faculty met for a consensus meeting in which differences in interrater scores and any difficulties identified in the scoring protocol were discussed. A standardized scoring protocol was developed to reduce procedural variability between reviewers. Scoring difficulties largely stemmed from subtle adaptations to the IMR program. For example, an average length of stay of approximately three weeks in admission units made participation in all 11 IMR modules unrealistic. As a result, scoring for Program Length (Item 2) focused on the average number of modules received, not the number planned. In addition, in 2012 faculty agreed to refrain from prompting hospital staff during facilitator interviews for examples of strategies that could be used in IMR sessions. This was done to reduce procedural variation between reviewers and to obtain an uncued measure of the self-reported clinical interventions used in IMR.

All data were entered into SPSS v. 20. IMR fidelity items were sorted into two subgroups, structural or clinical, based on the categorizations from Bond et al. (2009). Structural items were identified as Items 1 through 4 on the IMR Fidelity Scale; clinical items were identified as Items 5 through 13.
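
The analysis itself was run in SPSS; purely as an illustration of the subgroup split (with hypothetical item ratings), the same grouping could be expressed as:

```python
# Hypothetical ratings keyed by IMR Fidelity Scale item number (1-13)
ratings = {1: 5, 2: 4, 3: 4, 4: 5, 5: 3, 6: 2, 7: 3,
           8: 3, 9: 2, 10: 3, 11: 2, 12: 3, 13: 3}

# Items 1-4 are structural; Items 5-13 are clinical (Bond et al., 2009)
structural = [ratings[i] for i in range(1, 5)]
clinical = [ratings[i] for i in range(5, 14)]

print(sum(structural) / len(structural))        # mean structural -> 4.5
print(round(sum(clinical) / len(clinical), 2))  # mean clinical -> 2.67
```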

Results

Means and standard deviations for the fidelity item scores were calculated. Interrater reliability of fidelity reviewers was measured for 3 of the 6 years and achieved an average intraclass correlation coefficient of .84 across nine pairs of faculty, suggesting acceptable reliability. Overall, the structural items scored higher than the clinical items for all years and all hospitals, F(1, 11) = 21.93, p = .001 (see Figure 1). One hospital differed significantly from the others, F(3, 20) = 6.94, p = .002, and achieved a total mean score above 4 across the 13 scale items, indicating full implementation of IMR, for 2 of the 6 years.
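
As an illustration only (the published F tests came from the SPSS models described above, and the values below are hypothetical yearly subscale means), a comparable structural-versus-clinical contrast could be run as a one-way ANOVA in Python:

```python
from scipy import stats

# Hypothetical mean fidelity scores per year, one series per subscale
structural_means = [4.2, 4.3, 4.5, 4.4, 4.6, 4.3]
clinical_means = [2.9, 3.1, 3.3, 3.2, 3.4, 3.3]

# One-way ANOVA contrasting the two subscales across the 6 years
f_stat, p_value = stats.f_oneway(structural_means, clinical_means)
print(f"F(1, 10) = {f_stat:.2f}, p = {p_value:.4f}")
```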

Structural fidelity items for all hospitals across the 6 years (Items 1–4) ranged from 4.13 to 4.63, which translates into a "high" level of implementation per McHugo et al. (2007). The scores for the clinical fidelity items (Items 5–13) were lower, ranging from 1.04 to 4.04. See Figure 1 for a comparison of the average structural and clinical item scores. Combining the structural and clinical items, the overall fidelity score across all 6 years and four hospitals was 3.24 (see Table 1), indicating moderate fidelity (McHugo et al., 2007).

Discussion

IMR in four state psychiatric hospitals was implemented and sustained at moderate fidelity over a 6-year period. Differences in the assessment of the structural and clinical elements of the practice did occur. The average score for the structural aspects of the program reached the "fully implemented" range after the first year and remained there throughout the 6 years of observation. The average score for the clinical elements remained in the "partially implemented" range throughout the 6 years. This finding held for each year assessed, occurred despite protocol changes to reduce procedural variation in scoring, and was confirmed during blinded interrater comparisons. This suggests that fidelity to the clinical aspects of IMR is more difficult to achieve in state psychiatric hospitals than fidelity to the structural elements, similar to previous reports from community settings (Bond et al., 2009).

It is important to note that the various items on the IMR Fidelity Scale have not been calibrated or normed together, meaning that a score of 3 on a structural item is not necessarily equivalent to a score of 3 on a clinical item. It is also not clear how much the structural versus the clinical aspects of the program each contribute to clinical outcomes. Despite this, the study findings are consistent with previous evidence about the implementation of IMR (Bond et al., 2009; McGuire et al., 2015) and may suggest that clinically focused interventions like IMR require an additional focus on facilitators' clinical skills and their use in sessions.

Figure 1. Mean Structural and Clinical Fidelity Scores Across 6 Years. Note. This figure represents the average clinical and structural scores for four state psychiatric hospitals across 6 years.

There are four likely explanations for the lower clinical scores over the study period. The first involves the challenge of implementing a recovery-oriented EBP like IMR in a hospital with a long-standing culture predating the recovery model. The medical model in which mental health professionals assess, diagnose, prescribe, and treat, often with little input from the consumer (Bartholomew & Kensler, 2010), is in direct conflict with the philosophy of client centeredness found in IMR. This was periodically observed in IMR groups in the current study, where the consumer's chosen IMR recovery goal was unknown to their treatment team and unrelated to their treatment plan.

The second explanation for lower clinical scores could involve the implementation strategies used in this project, particularly the use of embedded university faculty, rather than hospital-based supervisors, to supervise the project. This arrangement may have contributed to a university focus on outcomes and fidelity while the hospitals focused on outputs (i.e., the number of groups provided). IMR programming, overseen by the university faculty, had limited performance oversight from hospital administration. For example, an IMR facilitator who provided many low-fidelity IMR groups would receive the same administrative feedback from their hospital supervisor as a staff member providing high-fidelity groups. This challenge to the sustainability of high fidelity has been reported in previous IMR implementation studies (e.g., Egeland et al., 2017). Using external trainers and consultants for IMR implementation is a common practice, but it may not help organizations continue the practice once those trainers or consultants terminate services (Egeland et al., 2017). Further, work on the RE-AIM (reach, efficacy, adoption, implementation, and maintenance) framework for evaluating health interventions describes a tension in translating effective interventions, developed and tested in controlled settings, into practice in complex real-world settings with time-pressured and perhaps unmotivated staff (Glasgow, Vogt, & Boles, 1999).

One common objection to IMR by hospital staff was that the manualized nature of IMR interfered with the facilitator's expertise and autonomy and impeded the use of a more eclectic clinical formulary of therapeutic interventions. While it is true that interventions need to be individualized to each consumer, idiosyncratic approaches, even when provided by expert clinicians, may lead to "black box" outcomes in which it is uncertain what was done and the results, even if effective, may not be replicable (Mowbray, Holter, Teague, & Bybee, 2003). EBPs provided with high fidelity, by contrast, offer not only room for creativity (Bartholomew & Kensler, 2010), but also the opportunity for replicable interventions with demonstrable effectiveness.

The third possible explanation for lower clinical fidelity involves what constitutes a realistic level of programmatic fidelity given the hospitals' expectations of staff and the barriers presented by an inpatient setting. Specifically, the concerns raised by Snyder et al. (2012) about inpatient use of EBPs should be considered, including the heterogeneity of patients' current level of functioning, illness severity, and the varying lengths of patient stay. One site in this study, with only one IMR group, did achieve high fidelity (an average score above 4) in 2 of the 6 years. This site, with the lowest penetration in the study (4%), was subject to large swings in fidelity and instability because the only trained facilitator left and the subsequent facilitator required time to become proficient in IMR. This site selected patients purposefully for participation in IMR to build a cohesive, more homogeneous, and motivated group.

Last, the implementation efforts were partly guided by the results of yearly reviews using the programmatic IMR Fidelity Scale. This scale, as the name implies, is intended for use at the level of the program and not at the level of the clinician, and it offers little specific feedback on what the facilitators of IMR should be doing clinically. One solution is the use of an audit and feedback approach (Ivers et al., 2014) with a validated measure focused on clinician behaviors in IMR. A validated measure of the clinical aspects of IMR is now available: the IMR Treatment Integrity Scale (IT-IS). Higher scores on the IT-IS were associated with higher scores on a measure of recovery (McGuire et al., 2015). This scale could be used to give IMR facilitators clinical feedback about their provision of IMR, supporting staff in improving their skills.
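
As a rough sketch of the bookkeeping such an audit and feedback loop might involve, the Python fragment below compares observed session ratings to a practice standard; the item names, rating scale, and threshold are hypothetical placeholders, not the published IT-IS content:

```python
def audit_feedback(observed, standards):
    """Flag clinical items where a facilitator's observed rating falls
    below the practice standard, in the spirit of audit and feedback
    (Ivers et al., 2014). Item names and scores are hypothetical; the
    actual clinical items are defined by the IT-IS (McGuire et al., 2015).
    """
    return [
        f"{item}: observed {observed.get(item, 0)}, standard {target}"
        for item, target in standards.items()
        if observed.get(item, 0) < target
    ]

# Hypothetical ratings from one observed IMR session
observed = {"motivational strategies": 2,
            "cognitive-behavioral strategies": 3,
            "goal setting and follow-up": 4}
standards = {item: 4 for item in observed}  # uniform practice standard

for line in audit_feedback(observed, standards):
    print(line)
# -> motivational strategies: observed 2, standard 4
# -> cognitive-behavioral strategies: observed 3, standard 4
```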

Conclusion

The longitudinal examination of IMR fidelity in state hospitals demonstrates that implementation can be maintained over time, in contrast with other reports of poor sustainability of IMR (Egeland et al., 2017). Despite the difficulties, the application of a standardized treatment fidelity tool was novel in this environment and offers hope that such clinical measurement of services can be used in these settings. By understanding the challenges of using existing fidelity measures in real-world settings, future work can address the practical strategies necessary to ensure that the critical elements of the practice are occurring. The use of IMR in the hospitals took years to become even a small part of the routine, everyday culture of the organizations, as is recommended for long-term maintenance and adoption of an EBP (Glasgow et al., 1999).

One recommendation to improve clinical fidelity scores involves composing groups of consumers interested in IMR with an eye toward homogeneity of current functioning. Second, implementers could identify staff who are interested in and motivated to provide IMR to fidelity. A third recommendation is to use an audit and feedback approach (Ivers et al., 2014) with the IT-IS (McGuire et al., 2015) to shape high-fidelity clinical practice. Audit and feedback involves comparing observations of clinical practice to practice standards for the purpose of improving interventions.

Future research in this area should explore the role of EBPs, including IMR, in the culture of state hospital systems. Also important are the unique implementation barriers in these institutions, including high staff attrition, short patient stays, and high heterogeneity of current patient functioning. Additional issues for future research include the need to develop a clear model of oversight, including consultation and supervision of IMR using audit and feedback, with the specific aim of improving staff's clinical practice. Last, the RE-AIM framework for evaluating health interventions may be particularly helpful in understanding levels of implementation and the feasibility of EBPs in real-world settings (Glasgow et al., 1999).

Tom Bartholomew, MA, Assistant Professor, Rutgers University
Michelle R. Zechner, PhD, Assistant Professor, Rutgers University
Joseph Birkmann, MSW, Assistant Professor, Rutgers University
Dawn L. Reinhardt-Wood, MA, CPRP, Lecturer, Rutgers University
Kenneth Kinter, MA, LPC, Assistant Professor, Rutgers University
Jennifer Sperduto, MS, Lecturer, Rutgers University
Ruth Cook, MS, Regional Director, Triple C Housing; Adjunct Assistant Professor, Rutgers University
Michael Giantini, PhD, MA, Clinical Psychologist, Trenton Psychiatric Hospital

Acknowledgment

We would like to thank Teresa McQuaide for her support in the completion of this project.

References

Bartholomew, T., & Kensler, D. (2010). Illness management and recovery in state psychiatric hospitals. American Journal of Psychiatric Rehabilitation, 13(2), 105–125. doi:10.1080/15487761003756977
Bartholomew, T., & Zechner, M. (2014). The relationship of illness management and recovery to state hospital readmission. Journal of Nervous and Mental Disease, 202(9), 647–650. doi:10.1097/NMD.0000000000000177
Bighelli, I., Ostuzzi, G., Girlanda, F., Cipriani, A., Becker, T., Koesters, M., & Barbui, C. (2016). Implementation of treatment guidelines for specialist mental health care. Cochrane Database of Systematic Reviews, 2016(2), Art. No. CD009780. doi:10.1002/14651858.CD009780.pub3
Bond, G. R., Drake, R. E., McHugo, G. J., Rapp, C. A., & Whitley, R. (2009). Strategies for improving fidelity in the national evidence-based practices project. Research on Social Work Practice, 19(5), 569–581. doi:10.1177/1049731509335531
Bond, G. R., Evans, L., Salyers, M. P., Williams, J., & Kim, H.-W. (2000). Measurement of fidelity in psychiatric rehabilitation. Mental Health Services Research, 2(2), 75–87. doi:10.1023/A:1010153020697
Briand, C., & Menear, M. (2014). Implementing a continuum of evidence-based psychosocial interventions for people with severe mental illness: Part 2—review of critical implementation issues. Canadian Journal of Psychiatry, 59(4), 187–195. doi:10.1177/070674371405900403
Egeland, K. M., Ruud, T., Ogden, T., Färdig, R., Lindstrøm, J. C., & Heiervang, K. S. (2017). How to implement Illness Management and Recovery (IMR) in mental health service settings: Evaluation of the implementation strategy. International Journal of Mental Health Systems, 11(1), 13. doi:10.1186/s13033-017-0120-z
Glasgow, R. E., Vogt, T. M., & Boles, S. M. (1999). Evaluating the public health impact of health promotion interventions: the RE-AIM framework. American Journal of Public Health, 89(9), 1322–1327.
Hasson-Ohayon, I., Roe, D., & Kravetz, S. (2007). A randomized controlled trial of the effectiveness of the illness management and recovery program. Psychiatric Services, 58(11), 1461–1466. doi:10.1176/appi.ps.58.11.1461
Ivers, N. M., Sales, A., Colquhoun, H., Michie, S., Foy, R., Francis, J. J., & Grimshaw, J. M. (2014). No more "business as usual" with audit and feedback interventions: Towards an agenda for a reinvigorated intervention. Implementation Science, 9(14), 1–14. doi:10.1186/1748-5908-9-14
Kirkpatrick, D. L. (1979). Techniques for evaluating training programs. Training and Development Journal, 33(6), 78–92.
Lehman, A. F., Kreyenbuhl, J., Buchanan, R. W., Dickerson, F. B., Dixon, L. B., Goldberg, R., . . . Steinwachs, D. M. (2004). The Schizophrenia Patient Outcomes Research Team (PORT): Updated treatment recommendations 2003. Schizophrenia Bulletin, 30(2), 193–217. doi:10.1093/oxfordjournals.schbul.a007071
McGuire, A. B., Kukla, M., Green, A., Gilbride, D., Mueser, K. T., & Salyers, M. P. (2014). Illness management and recovery: A review of the research. Psychiatric Services, 65(2), 171–179. doi:10.1176/appi.ps.201200274
McGuire, A. B., White, D. A., Bartholomew, T., Flanagan, M. E., McGrew, J. H., Rollins, A. L., . . . Salyers, M. P. (2015). The relationship between provider competence, content exposure, and consumer outcomes in illness management and recovery programs. Administration and Policy in Mental Health and Mental Health Services Research, 44(1), 81–91. doi:10.1007/s10488-015-0701-6
McHugo, G. J., Drake, R. E., Whitley, R., Bond, G. R., Campbell, K., Rapp, C. A., . . . Finnerty, M. T. (2007). Fidelity outcomes in the National Implementing Evidence-Based Practices Project. Psychiatric Services, 58(10), 1279–1284. doi:10.1176/appi.ps.58.10.1279
Mowbray, C. T., Holter, M. C., Teague, G. B., & Bybee, D. (2003). Fidelity criteria: Development, measurement, and validation. American Journal of Evaluation, 24(3), 315–340. doi:10.1016/S1098-2140(03)00057-2
Mueser, K. T., & Gingerich, S. (2002). Illness management and recovery implementation resource kit. Rockville, MD: U.S. Substance Abuse and Mental Health Services Administration, Center for Mental Health Services.
Mueser, K. T., Meyer, P. S., Penn, D. L., Clancy, R., Clancy, D. M., & Salyers, M. P. (2006). The illness management and recovery program: Rationale, development, and preliminary findings. Schizophrenia Bulletin, 32(Suppl. 1), 32–43. doi:10.1093/schbul/sbl022
Smith, R. C., & Bartholomew, T. (2006). Will hospitals recover? The implications of a recovery-orientation. American Journal of Psychiatric Rehabilitation, 9(2), 85–100. doi:10.1080/15487760600875982
Snyder, J. A., Clark, L. M., & Jones, N. T. (2012). Provision and adaptation of group treatment in state psychiatric hospitals. Professional Psychology: Research and Practice, 43(4), 395–402. doi:10.1037/a0029377
Substance Abuse and Mental Health Services Administration. (2009). Illness management and recovery: Evaluating your program (HHS Pub. No. SMA-09-4462). Rockville, MD: Author.
Torrey, W. C., Finnerty, M., Evans, A., & Wyzik, P. (2003). Strategies for leading the implementation of evidence-based practices. Psychiatric Clinics of North America, 26, 883–897. doi:10.1016/S0193-953X(03)00067-4
Torrey, W. C., Lynde, D. W., & Gorman, P. (2005). Promoting the implementation of practices that are supported by research: The National Implementing Evidence-Based Practice Project. Child and Adolescent Psychiatric Clinics of North America, 14(2), 297–306. doi:10.1016/j.chc.2004.05.004
