Journal Article

Mihee An, Department of Physical Therapy, Duquesne University, Pittsburgh, Pennsylvania; Stacey C Dusing, Department of Physical Therapy, Virginia Commonwealth University, Richmond, Virginia; Regina T Harbourne, FAPTA, Department of Physical Therapy, Duquesne University, 600 Forbes Avenue, Pittsburgh, PA 15282 (USA); Susan M Sheridan, Nebraska Center for Research on Children, Youth, Families and Schools, University of Nebraska-Lincoln, Lincoln, Nebraska. Dr Harbourne is a board-certified clinical specialist in pediatric physical therapy. Address all correspondence to Dr Harbourne at: harbourner@duq.edu.
Physical Therapy, Volume 100, Issue 5, May 2020, Pages 757–765, https://doi.org/10.1093/ptj/pzaa006
Published: 16 January 2020

Article history: Revision requested: 28 February 2019; Revision received: 01 July 2019; Accepted: 06 October 2019.
Mihee An, Stacey C Dusing, Regina T Harbourne, Susan M Sheridan, START-Play Consortium, What Really Works in Intervention? Using Fidelity Measures to Support Optimal Outcomes, Physical Therapy, Volume 100, Issue 5, May 2020, Pages 757–765, https://doi.org/10.1093/ptj/pzaa006
Abstract
A critical factor to move the field of physical therapy forward is the measurement of fidelity during comparisons of interventions. Fidelity translates as “faithfulness”; thus, fidelity of intervention means faithful and correct implementation of the key components of a defined intervention. Fidelity measurement guards against deviations from, or drift in, the delivery of a targeted intervention, a process necessary for evaluating the efficacy of rehabilitation approaches. Importantly, attention to fidelity measurement differentiates rehabilitation approaches from each other. However, earlier research comparing physical therapist interventions often reported findings without careful attention to fidelity measurement. The purpose of this paper is 2-fold: (1) to support the development of intervention-specific fidelity measures in physical therapy research as the gold standard for translating research findings to clinical practice, and (2) to describe the process of creating a multi-dimensional fidelity measurement instrument in rehabilitation intervention. Improved attention to fidelity measurement will allow the rehabilitation field to communicate interventions clearly with a direct link to outcomes and target the implementation of our improved intervention for the right patient problem with the right dose and the right ingredients at the right time.
Intervention fidelity refers to the degree to which a specific intervention is implemented as intended,1,2 critically supporting effectiveness research. Fidelity measurement underlies the reliable examination of the effect of interventions, particularly when 2 or more interventions are compared.3,4 When comparing treatments, fidelity measurement ensures the accurate presentation and examination of a rehabilitation approach and prevents potentially false conclusions. For example, imagine a clinical research study that compares 2 interventions (conventional intervention A and new intervention B) with a hypothesis that the new intervention B will be more effective than intervention A on a specific outcome. If intervention B is implemented poorly (eg, the interventionists omit some of the key components), the delivered interventions may not differ meaningfully, and when the outcomes between groups are compared at the end of the study, there may be no difference between the 2 groups. In this example, the lack of support for the new intervention B may be because it is not as effective as expected; alternatively, it may be due to inadequate implementation of intervention B or the implementation of something different from either A or B. It is impossible to distinguish poor effectiveness from poor fidelity of implementation unless fidelity measurement occurred during the study. In addition, if other researchers replicate the research but maintain high fidelity of intervention B, intervention B may result in stronger outcomes than intervention A. In the end, the 2 studies present opposing results, making it difficult to draw a conclusion about the effectiveness of the new intervention and difficult to translate these competing results to clinicians.
Intervention fidelity has been addressed in psychological and health behavioral research, but only recently have studies of intervention using fidelity measurement in physical therapy been published.5–7 Although a few studies have developed and used a fidelity measure in physical therapy,8,9 attention to fidelity measurement is still very limited. A recent overview extracted a random sample of 200 reports of randomized controlled trials from the Physiotherapy Evidence Database and evaluated the completeness of the descriptions of physical therapist interventions using the Template for Intervention Description and Replication (TIDieR) checklist.10 According to the TIDieR checklist, 23% of the trials incompletely described the intervention group, and 75% of trials incompletely described the control group intervention, indicating incomplete description of the majority of intervention practices in physical therapy randomized controlled trials. Most studies investigating interventions for stroke and osteoarthritis did not clearly describe how interventions were developed or monitored during the trial.5,6 Recent reviews of both adult neurologic11 and pediatric interventions12 reported serious methodologic concerns in most studies due to a lack of fidelity measurement, thus reducing the impact of the conclusions that could be drawn from the studies. Attention to fidelity measurement during comparisons of interventions is vital to optimize rehabilitation research and evidence-based clinical care, but it is currently lacking.
The purpose of this perspective is to support the importance of using a unique, multi-dimensional fidelity measure for any comparison or description of interventions to advance rehabilitation effectiveness and efficacy. Additionally, we describe in detail the process of creating an intervention-specific fidelity measurement tool to examine a new intervention.
Importance of Measuring Intervention Fidelity
Adequate fidelity measurement and reporting of intervention fidelity improves the interpretability of the outcome data in research studies as well as the replicability of the intervention, thus easing clinical translation. Fidelity measurement also adds to the rigor of the analysis when statistically comparing groups and adds to the internal validity of the research project. Although researchers clearly understand the dangers of Type I error (a false-positive result or incorrectly rejecting a null hypothesis that is true) and Type II error (a false-negative result or failing to reject the null hypothesis when it is false), another important type of error (Type III) has received little attention. Type III error has been described as getting the right answer for the wrong reason.13 In the context of fidelity measurement, a Type III error occurs when a mismatch between the intended intervention and the delivered intervention leads researchers to evaluate something that did not occur, thereby confounding the actual research question.13,14 When a Type III error occurs, research findings may provide misleading information. For instance, stakeholders may decide to stop effective programs based on negative results that derive from poor implementation (due to inadequate fidelity of the planned intervention) rather than an ineffective intervention. Without the proper evaluation of intervention fidelity, researchers and clinicians cannot determine whether the lack of impact is due to the intervention itself (eg, poorly conceptualized, wrong theory, or invalid intervention) or poor translation of the intervention to the implementation phase.
Higher levels of intervention fidelity have been shown to contribute to better intervention outcomes.3,15 For example, a recent review evaluating data from more than 500 studies found that when interventions are well implemented (with high fidelity), the magnitude of mean effect sizes is at least 2 to 3 times larger than when fidelity of intervention is low.15 When fidelity measurement of a specific intervention occurs, that quantity can be used as a moderator in the statistical examination of treatment effects, thereby revealing whether the intervention is differentially effective under varying levels of intervention fidelity.16,17 Thus, support for the inclusion of fidelity measures when examining intervention is strong.
As the importance of intervention fidelity has been established, attention to creating strong fidelity measurement has increased. Intervention fidelity can be assessed by either generic measures or intervention-specific measures.9,18,19 Generic fidelity measures evaluate general attributes of intervention that are essential for any intervention (eg, therapist-family rapport) and strategies that are commonly used across similar interventions.19,20 Although generic fidelity measures can be broadly applied to theoretically similar interventions, they are insufficient to capture the unique components of an intervention that distinguish it from other interventions. Intervention-specific fidelity measures evaluate key components of a particular intervention and allow for replicating efficacy or effectiveness studies on the intervention. It is particularly important to develop and use a well-designed, intervention-specific fidelity measure that reflects the key components of that intervention when a new intervention is examined or compared with existing practice. Below, we describe the dimensions of fidelity measurement needed to create an intervention-specific instrument.
Five Dimensions of Fidelity
Intervention fidelity is a multidimensional, measurable construct with both quantitative and qualitative dimensions. Fidelity is commonly described with 5 dimensions amenable to measurement: adherence, dosage, quality of intervention delivery, participant responsiveness, and program differentiation (Figure).3,21,22 Adherence measures the adequate delivery of the key components of an intervention; in other words, delivery of the intervention as it was designed. Dosage describes the amount of a specific intervention delivered, often measured in terms of time. Quality of intervention delivery is the way interventionists deliver the intervention using overall processes or strategies as prescribed by developers. Participant responsiveness indicates the extent to which participants respond to or are engaged by the intervention and is another overall qualitative judgment. Lastly, program differentiation involves the extent to which the components and processes of the intervention being studied differ from other interventions (eg, in a comparison of interventions study). Monitoring of the comparison group is important for program differentiation because the compared (often the standard care) intervention may contain some of the same key components as the target intervention. Program differentiation can be key to identifying active ingredients that are essential for successful intervention. Below, we further examine each fidelity dimension and the way in which it can be measured.
Table

Example Approaches to Measure Multiple Dimensions of Intervention Fidelity^a

|  | SOLAS Study9,24 | Getting Ready Study18 |
|---|---|---|
| Methods of data collection | Checklist by self-report, direct observation, and audio recording | Behavioral coding by observation of video recording |
| Fidelity measurement for comparison group | No (no comparison group) | Yes (Getting Ready intervention in experimental group vs typical early intervention in comparison group) |
| Adherence | 25 components for each of 6 weekly sessions were identified. Individual components rated as yes/present (a score of 2), no/absent (a score of 0), or attempted (a score of 1). Overall adherence score was computed by summing scores of all components | 11 intervention strategies were identified. Individual strategy use was coded as present if observed at all during the 1-min interval (1-min partial-interval recording procedure). Overall adherence was determined by proportion of intervals in which each strategy was used (individual strategy use) and sum of proportions of individual strategy use (total strategy use) |
| Dosage | Duration of each session (education + exercise components) was documented; actual duration of exercise component was compared with its intended duration of 45 min | Defined as number of sessions completed, but not included in the data analysis because it was consistent across all professionals in both groups as a function of school readiness programming |
| Quality of intervention delivery | Not measured | Professionals' effectiveness in providing Getting Ready intervention strategies was rated on a scale of 1 to 4. 1 = professional does not encourage/invite parental participation; is entirely focused on child and ineffective in initiating conversations with parent. 4 = professional provides ample opportunities for collaboration and initiates meaningful conversation with parent; is focused on parent–child relationship and completely effective in initiating conversations and discussions with parent |
| Participant responsiveness | Not measured | Parental level of interest and engagement with professional were rated on a global scale of 1 to 4. 1 = parent does not indicate interest in material or activities presented by professional; parent participation is more passive and limited. 4 = parent displays much interest in or initiates activities with professional and participates in bidirectional discussions; parent's participation is active (eg, initiates and elaborates on topics of discussions) |
| Program differentiation | Not measured | Multiple variables were compared between experimental and comparison groups, including: proportion of Getting Ready strategy use (adherence); ratings of professionals' effectiveness (quality of intervention delivery); ratings of parental interest/engagement (participant responsiveness) |

^a Getting Ready = relationship-based school readiness intervention for children from birth to age 5; SOLAS = Self-management of Osteoarthritis and Low back pain through Activity and Skills.
Measurement of Individual Dimensions of Fidelity
Although intervention fidelity optimally includes 5 dimensions, previous intervention research has typically measured only 1 or 2 of them (eg, adherence, dosage).3,21 To date, no physical therapy study has measured all 5 dimensions of intervention fidelity. Because each dimension is an important aspect of fidelity, optimal fidelity measurement tools should be multi-dimensional. The 5 dimensions of fidelity together measure both structure (eg, adherence and dosage) and process (eg, quality of delivery).22,23 Fidelity to structure measures both the content and the amount of the intervention. Fidelity to process captures the way in which the intervention is delivered and involves interactions between therapist and patient/family. Fidelity measurement of process requires qualitative judgment, which may be more difficult to measure reliably but may be significant in terms of intervention effects.23 To capture a comprehensive picture of intervention fidelity and consequently produce an accurate examination of the effectiveness of an intervention, measurement of all 5 dimensions of fidelity is recommended.3,21 However, evaluating all 5 dimensions is not always feasible or practical. Thus, it falls to the researcher to develop a unique fidelity measure that is relevant to their research question and to the outcomes of the specific intervention, selecting among the 5 dimensions as needed for the study. Measurement approaches for each of the 5 dimensions of fidelity, with 2 example studies, are presented below and in the Table.
Adherence
Adherence can be quantified based on whether each key component of an intervention is present or absent. If a key component only needs to occur once, then a single score of present or absent is adequate. For instance, in research examining the feasibility of a group-based complex behavioral intervention to promote self-management for people with osteoarthritis or chronic low back pain (SOLAS study), researchers developed an intervention fidelity protocol and checklists consisting of approximately 25 components for each of 6 weekly sessions. Each component was rated as yes/present (a score of 2), no/absent (a score of 0), or attempted (a score of 1), and the overall adherence score was computed by summing the scores of all components.9,24 Another option, if more detail is needed, is to use the percent of occurrence of each component. For example, in research examining the efficacy of the Getting Ready intervention (promoting school readiness through enhancing parent engagement), researchers defined 11 strategies of the Getting Ready intervention and recorded interventionists’ use of individual strategies using 1-minute partial-interval recording procedures.18 Use of each strategy was coded as present if it was observed at all during 1-minute intervals; then, the proportion of time (percent of the session) in which each strategy was used was calculated. These 2 examples both show measures of adherence; the SOLAS study used counts, and the Getting Ready study used percentages. Both methods provided quantification of the adherence dimension of fidelity.
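The two scoring schemes above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the SOLAS or Getting Ready studies; the component names, strategy data, and function names are hypothetical.

```python
def checklist_adherence(ratings):
    """SOLAS-style scoring: each component rated 2 (yes/present),
    1 (attempted), or 0 (no/absent). Returns the summed score and
    the percent of the maximum possible score."""
    total = sum(ratings.values())
    max_score = 2 * len(ratings)
    return total, 100.0 * total / max_score

def interval_adherence(intervals):
    """Getting Ready-style partial-interval recording: each 1-min
    interval is True if the strategy was observed at all during it.
    Returns the percent of intervals in which the strategy was used."""
    return 100.0 * sum(intervals) / len(intervals)

# Hypothetical session data for illustration
session = {"goal_setting": 2, "problem_solving": 1,
           "exercise_demo": 2, "feedback": 0}
score, pct = checklist_adherence(session)   # 5 of a possible 8 points

strategy_observed = [True, True, False, True, False, False, True, True]
rate = interval_adherence(strategy_observed)  # observed in 5 of 8 intervals
```

Counts and percentages answer the same question (was each key component delivered?) at different grains: the checklist gives a per-session total, whereas interval recording captures how pervasively a strategy was used within a session.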
Dosage
Dosage must be operationally defined and can be quantified using the number of sessions completed, the overall duration of the intervention, or the time spent implementing key components of the intervention as designed. Measurement of dosage supplies crucial information such as completeness of delivery and differences in dosage between groups. For example, in the SOLAS study,24 the researchers documented the duration of each session and compared the actual duration of the exercise component with its intended duration of 45 minutes (as a measurement of completeness of delivery). In the Getting Ready study,18 dosage was defined as the number of home visit sessions completed.
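A completeness-of-delivery calculation of the kind used in the SOLAS study can be sketched as follows. Only the 45-minute intended duration is taken from the study; the session durations and the helper name `dosage_completeness` are hypothetical.

```python
def dosage_completeness(actual_minutes, intended_minutes=45):
    """Percent of the intended exercise time actually delivered,
    capped at 100 so over-delivery does not inflate the average."""
    return min(100.0, 100.0 * actual_minutes / intended_minutes)

# Hypothetical exercise-component durations (minutes) for 6 sessions
sessions = [45, 38, 45, 30, 45, 42]
per_session = [dosage_completeness(m) for m in sessions]
mean_completeness = sum(per_session) / len(per_session)  # about 90.7%
```

Reporting per-session completeness alongside the mean also exposes drift, such as a single badly truncated session hidden inside an acceptable average.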
Quality of Intervention Delivery and Participant Responsiveness
Gestalt variables are suitable for evaluating fidelity to intervention process dimensions (eg, quality of intervention delivery and participant responsiveness). To use gestalt items (eg, overall quality ratings), criteria for the indicators should be clearly defined based on the theory or principles underlying the intervention. For example, early intervention and pediatric rehabilitation programs based on family-centered principles include interactive activities to facilitate family engagement and collaborative intervention. Therefore, the quality of interaction and the degree to which interactive activities focus on the key components are important to measure. In the Getting Ready intervention study,18 the researchers used global rating scales with scores ranging from 1 (low) to 4 (high) to measure quality of intervention delivery and participant responsiveness (Table). As a measure of the quality of intervention delivery, the overall effectiveness of the early childhood professional was rated based on the professional's ability to provide opportunities for collaboration, to initiate meaningful conversations and discussions with the parent, and to focus on the parent–child relationship. For participant responsiveness, the researchers rated parental level of interest and engagement with the professional (eg, parent engagement in discussion and brainstorming and active participation in activities) and with their child (eg, engaging in and starting meaningful conversations and interactions with their child).
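Once global ratings are collected, summarizing them per dimension is straightforward. The sketch below is illustrative only: the dimension names mirror the text, but the ratings and the 3.0 acceptability cutoff are assumptions, not values from the Getting Ready study.

```python
from statistics import mean

# Hypothetical session-level gestalt ratings (1 = low, 4 = high),
# in the style of the Getting Ready global scales
ratings = {
    "quality_of_delivery":        [4, 3, 4, 3, 4],
    "participant_responsiveness": [3, 2, 2, 3, 3],
}

# Mean rating per fidelity dimension across sessions
summary = {dim: mean(vals) for dim, vals in ratings.items()}

# Flag whether each dimension meets an a priori acceptability cutoff
ACCEPTABLE = 3.0
flags = {dim: avg >= ACCEPTABLE for dim, avg in summary.items()}
# Here delivery quality (mean 3.6) meets the cutoff, while
# participant responsiveness (mean 2.6) falls below it.
```

Because gestalt ratings involve qualitative judgment, they should be accompanied by interrater reliability checks before summaries like these are used as moderators in outcome analyses.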
Program Differentiation
Lastly, to measure program differentiation, researchers may use data from the other 4 dimensions of fidelity to determine differences between interventions. For example, researchers may measure therapists in both the experimental and standard care groups to see whether they implement key components of the assigned intervention (adherence), in what amount (dosage), and in what way (quality of delivery, participant responsiveness). In the Getting Ready study,18 program differentiation was determined based on a combination of the rate of strategy implementation (adherence), ratings of professionals' effectiveness in providing the Getting Ready intervention strategies (quality of intervention delivery), and ratings of parental engagement (participant responsiveness) in experimental and comparison (typical early intervention) groups. Measuring intervention fidelity in both groups allowed the researchers to identify components of the Getting Ready intervention that were also used in the standard care group and to verify the key components of the Getting Ready intervention that were clearly different from typical early intervention. They found that some strategies were implemented at similar rates in both groups, suggesting that these are generic strategies common to early intervention programs, whereas other strategies were implemented at significantly higher rates in the experimental group, indicating that they were unique to the Getting Ready intervention.
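A minimal sketch of this kind of group comparison: label each strategy as unique to the experimental intervention or shared with standard care, based on the gap in use rates between groups. The strategy names, rates, and the 20-percentage-point threshold are all hypothetical; the Getting Ready study used statistical significance tests rather than a fixed cutoff.

```python
def differentiation(exp_rates, comp_rates, min_gap=20.0):
    """Label each strategy 'unique' if its use rate (percent of
    intervals) in the experimental group exceeds the comparison
    group's rate by at least `min_gap` percentage points;
    otherwise label it 'shared' (a generic strategy)."""
    labels = {}
    for strategy, exp_rate in exp_rates.items():
        gap = exp_rate - comp_rates.get(strategy, 0.0)
        labels[strategy] = "unique" if gap >= min_gap else "shared"
    return labels

# Hypothetical strategy-use rates (percent of observed intervals)
exp  = {"guided_observation": 55.0, "affirm_parent": 40.0, "movement_practice": 70.0}
comp = {"guided_observation": 10.0, "affirm_parent": 35.0, "movement_practice": 65.0}
labels = differentiation(exp, comp)
# guided_observation is flagged unique; the other two are shared
```

The "shared" strategies are candidates for generic fidelity items, while the "unique" ones point toward the active ingredients that distinguish the new intervention.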
Initial Development of an Intervention-Specific Fidelity Measure
Creating an intervention-specific fidelity tool is generally a 3-step process.23 The first step, identifying possible indicators or key components of the approach by either expert consensus or previous research, is critical; this step is often repeated at the end of the process to assure the inclusion of all key components. The second step is to establish a measurement system, which involves decisions about how to measure the key components (eg, direct observation, coding video, or use of products such as written notes from the intervention) and how to determine whether the intervention is implemented with acceptable fidelity. The final step is to examine the reliability and validity of the fidelity measurement instrument. One practical consideration in developing a fidelity measurement instrument is the cost in time and money. Fidelity measurement adds cost to any study, and this should be accounted for during the design stage so that personnel needs are planned for and reflected in funding requests.
Identifying Key Components
The first step in developing a fidelity measure is the identification of key components of an intervention.25,26 A new intervention, or the intervention of interest in any training or comparison of effectiveness research, presents with particular claims regarding the content, process, training requirements, and potential outcome of that intervention. Ideally, the successful intervention components have been identified through prior investigations. In actual practice, therapists often “telegraph” the content of an intervention using shorthand (eg, a functional approach or a top-down approach). Successful development of a fidelity tool requires that an intervention be reduced to several key components that differentiate the intervention from other, possibly similar approaches for the same clinical problem. These key components of the intervention should fully define what the intervention is targeting and the portions of the intervention that, if removed, would change the intervention as a whole. Thus, the exploratory phase of developing a fidelity tool involves discussions with experts in interventions used for the clinical problem being addressed, examination of examples of the intervention of interest as well as counter-examples, and the use of evidence-based information. Because rehabilitation is a dynamic and interactive endeavor, many interventions will have common components, such as patient/therapist discussions, physical contact between patient and therapist, and movement practice. The development of the key component list must include the behaviors of the therapist, which can be observed within the course of the intervention and which fit within the theory driving the intervention. As part of the process of selecting key components, observing videos of both the new and the standard-of-care interventions, brainstorming, and critically discussing each component help to identify, name, and define the items.
Critical to the process of identifying key components of an intervention is the understanding of the theory underlying the intervention of interest. For example, the Getting Ready intervention is based on the important role that parents play in preschool readiness.18,27 The Getting Ready intervention focuses on promoting parent-child interactions and parents’ competence and confidence in maximizing children’s natural learning opportunities through collaborative partnerships between parents and professionals, which differs from other early intervention programs that focus on skills a child fails to perform. Therefore, the key components of the Getting Ready intervention include collaborative strategies used by professionals to support parents (eg, asking for and building on parental observations of child development, offering affirmation, encouraging parents to discuss and prioritize concerns and needs for their child).
Establishing the Measurement System
Once the key components of the intervention are defined, the second step in development of the fidelity measurement tool is to determine how to quantify the key components as well as other dimensions of fidelity. Self-report by participants and ratings by experts or trained raters are common methods to assess fidelity. For self-report measures, key components or steps comprising an intervention are listed, and participants document whether each of the components is completed. Self-report measures can be simple and less reliant on resources (eg, human, time, material) but may result in overestimation of fidelity.22 Observation (direct observation or watching videos of intervention) involves trained and reliable individuals, ideally independent from the research, documenting whether each of the key components occurred. It is more objective but more resource intensive than self-report and less commonly used in intervention studies. Using independent observers can add considerable cost to a study and requires careful calculation when preparing budget estimates. As each method has advantages and disadvantages, it falls to developers to determine a measurement method that is appropriate and sensitive to capturing the key components of an intervention. In some cases, more than one approach to measurement can be used to fully measure intervention fidelity (eg, video observation by independent raters to measure adherence and quality of intervention delivery, self-report by patient/family to measure participant responsiveness, chart review for attendance rate as a measurement of dosage).
Developers also need to establish criteria to determine whether the intervention is delivered with an acceptable level of fidelity. Developers may use previously reported levels of fidelity or establish their own criteria for an intervention. For example, self-report by therapists was used to measure fidelity in the SOLAS study24 and was based on previously reported criteria; 80% to 100% adherence to the fidelity checklist (ie, >80% of items on the checklist were implemented) was interpreted as high fidelity, 51% to 79% as moderate, and 50% or below as low fidelity.28–30 In the Getting Ready intervention study, no specific predetermined levels of fidelity to the Getting Ready intervention were indicated. Instead, as a first approximation, the researchers defined good fidelity as total strategy use of 50% (ie, strategies observed by an independent rater in over one-half of the intervals).18
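The checklist-based adherence bands described above reduce to simple arithmetic. A minimal sketch using the thresholds reported for the SOLAS study; the item counts in the usage lines are invented for illustration:

```python
def adherence_level(items_implemented, items_total):
    """Classify checklist adherence using the thresholds reported for
    the SOLAS study: 80%-100% high, 51%-79% moderate, 50% or below low."""
    pct = 100 * items_implemented / items_total
    if pct >= 80:
        return "high"
    if pct > 50:
        return "moderate"
    return "low"

print(adherence_level(17, 20))  # 85% -> "high"
print(adherence_level(12, 20))  # 60% -> "moderate"
print(adherence_level(9, 20))   # 45% -> "low"
```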
To establish criteria for a new intervention, developers may choose to determine a likely upper limit of fidelity by scoring sessions that the team and experts deem to be good examples of the target intervention. Subsequently, sessions of the comparison group (eg, standard intervention) can be scored, as well as sessions that barely implement the target intervention. With these 3 data points (from well implemented to barely implemented intervention to standard intervention), the range of fidelity scores can determine an acceptable range of fidelity (eg, acceptable if each of the key components of the intervention is implemented above a certain proportion/frequency) or levels of fidelity (eg, high, moderate, low fidelity). The criteria may also differ depending on the type of research or practice. For example, for an efficacy trial, researchers may set very high standards of adherence (eg, adherence should be >80%, quality of delivery should not be <3 on a scale of 1 to 4). If researchers are interested in how intervention fidelity affects the outcomes, they may set levels of fidelity (eg, high, moderate, or low fidelity) and use the fidelity value as a moderator when analyzing outcome data. In an effectiveness trial examining the intervention in a variety of clinical settings, the standards may be lower (eg, adherence >50% is acceptable) because control is reduced, and the goal is to see how the intervention fares when translated to clinical practice; the lower standard of fidelity may be expected when broadening the scope to multiple interventionists, facilities, or locations.
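The 3-data-point anchoring approach above can be sketched as follows. All session scores and cutoff rules here are hypothetical, chosen only to illustrate deriving fidelity bands from scored exemplar, barely implemented, and standard care sessions rather than from arbitrary numbers:

```python
# Hypothetical fidelity scores (0-100) from three kinds of scored sessions.
well_implemented = [88, 92, 85, 90]   # expert-vetted exemplar sessions
barely_implemented = [55, 60, 52]     # minimally acceptable sessions
standard_care = [20, 30, 25, 18]      # comparison-group sessions

# One simple anchoring rule (an assumption, not a published standard):
# "acceptable" begins at the lowest barely-implemented score,
# "high" begins at the lowest exemplar score.
acceptable_cutoff = min(barely_implemented)   # 52
high_cutoff = min(well_implemented)           # 85

def fidelity_band(score):
    """Band a session's fidelity score using the anchored cutoffs."""
    if score >= high_cutoff:
        return "high"
    if score >= acceptable_cutoff:
        return "acceptable"
    return "below acceptable"

print(fidelity_band(87))  # "high"
print(fidelity_band(70))  # "acceptable"
print(fidelity_band(35))  # "below acceptable"
```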
Testing Reliability and Validity
The third step in developing a fidelity measure is to test the measurement system in terms of reliability and validity. There are several ways to test psychometric properties of fidelity measures.23 First, developers can examine reliability of the fidelity measurement tool across respondents (eg, inter-rater reliability on behavioral coding) by calculating percentage of agreement, the kappa coefficient, the intraclass correlation coefficient, or Pearson correlation coefficients. This testing will demonstrate the clarity of the definitions of key components. One way to improve and clarify the definitions of key components is to have multiple raters observe and code the same intervention session (by direct observation or video review) and compare coding results. For components on which the raters disagree, raters and developers discuss, refine, and add more examples as needed. This process is repeated until the raters reach acceptable inter-rater reliability. Second, examining differences in fidelity scores across different types of interventions (known-group method) is an approach to test the validity of the fidelity measurement tool and will show whether the measure differentiates the target intervention from other interventions. Third, convergent validity can be tested by examining the agreement between 2 different sources of information about an intervention. For example, in the SOLAS study, researchers examined agreement between self-report by therapists, direct observation, and audio-record review by a researcher.24
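Percentage of agreement and the kappa coefficient mentioned above can be computed directly from two raters' codes. A minimal sketch for binary interval coding (1 = key component observed in an interval, 0 = not observed); the ratings are invented for illustration:

```python
def percent_agreement(r1, r2):
    """Proportion of intervals on which the two raters agree."""
    agree = sum(a == b for a, b in zip(r1, r2))
    return agree / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters' binary codes: observed agreement
    corrected for the agreement expected by chance."""
    n = len(r1)
    po = percent_agreement(r1, r2)          # observed agreement
    p1_yes = sum(r1) / n                    # rater 1's "yes" rate
    p2_yes = sum(r2) / n                    # rater 2's "yes" rate
    pe = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)  # chance agreement
    return (po - pe) / (1 - pe)

rater1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater2 = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
print(round(percent_agreement(rater1, rater2), 2))  # 0.8
print(round(cohens_kappa(rater1, rater2), 2))       # 0.52
```

Because kappa discounts chance agreement, it is lower than raw percentage agreement here (0.52 vs 0.80), which is why it is often preferred for behavioral coding.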
Finally, after the fidelity measurement tool is refined and utilized, the key ingredients of the intervention can be verified (Figure). Active ingredients of an intervention are identified after comparison with another intervention is complete, and outcomes are compared. Program differentiation of the 2 interventions will allow confirmation of the clear differences between interventions that can then be directly linked to improved outcomes and contribute to clear translation of the active ingredients of the intervention.
Use of Fidelity Measurement in Clinical Practice
Successful implementation of a new intervention depends on an adequate support system, which can be guided by an intervention-specific fidelity measure. To optimize and standardize implementation of a new intervention, researchers and clinicians may use fidelity measurement to develop therapist supports, including manuals, training, monitoring on-site or with video, or direct supervision. A detailed manual and training can prepare therapists for the provision of new intervention strategies and improve their confidence in implementing a new intervention.
The adequacy of training and competence of implementation can also be evaluated by assessing intervention fidelity (eg, whether therapists meet the predetermined fidelity criteria) during and after the training process, which ensures therapists receive adequate training to deliver the intervention as designed. Ongoing support and monitoring can prevent deviations from the intended intervention or, if any deviation occurs, address and correct it immediately, which leads to a high degree of fidelity and consequently improves outcomes. Even if the support system is unavailable or only partially available (eg, therapists from different locations trying to replicate the intervention), a fidelity measure describing the key components of the intervention can help therapists apply the intervention. In short, use of fidelity measurement can not only facilitate successful implementation and a clearer link to the outcomes of intervention in a research project but can also enhance the successful translation of evidence-based intervention into practice.
Conclusion
This article supports the importance of using a multi-dimensional fidelity measure in intervention comparison research and describes the process of developing an intervention-specific fidelity measure to examine the effectiveness and efficacy of a new intervention. Detailed description and adequate measurement of an intervention are essential to ensure the internal and external validity of rehabilitation research. Without adequate measurement of fidelity to intervention, there is no way to determine whether an unsuccessful outcome is due to an ineffective intervention or to poor implementation of the intervention. In addition, an intervention cannot be replicated without fidelity measurement. The descriptions and examples provide a springboard for addressing fidelity measurement in physical therapy research and for facilitating its uptake in rehabilitation practice.
Author Contributions and Acknowledgments
Concept/idea/research design: M. An, S.C. Dusing, R.T. Harbourne, S.M. Sheridan
Writing: M. An, S.C. Dusing, R.T. Harbourne
Data collection: S.C. Dusing, R.T. Harbourne
Data analysis: S.C. Dusing, R.T. Harbourne
Project management: S.C. Dusing, R.T. Harbourne
Fund procurement: S.C. Dusing, R.T. Harbourne
Providing participants: S.C. Dusing, R.T. Harbourne
Providing facilities/equipment: S.C. Dusing, R.T. Harbourne
Consultation (including review of manuscript before submitting): S.M. Sheridan
The authors acknowledge the members of the START-Play Consortium: Duquesne University (R.T. Harbourne, PT, PhD, FAPTA; H.-J. Chang, PT, PhD; M. An, PT, PhD), Virginia Commonwealth University (S.C. Dusing, PT, PhD; E. Marcinowski, PhD), University of Delaware (M.A. Lobo, PT, PhD; A.B. Cunha, PT, PhD; I. Babik, PhD), University of Washington (S.W. McCoy, PT, PhD, FAPTA; L-Y. Hsu, PT, PhD; M. Smersh), University of Nebraska-Lincoln (J. Bovaird, PhD; N. Koziol, PhD; S. Sheridan, PhD), and University of Nebraska Medical Center (S.L. Willett, PT, MS).
Funding
Institute of Education Sciences, US Department of Education, grant # R324A150103, Efficacy of the START-Play Program for Infants with Neuromotor Disorders.
Disclosures
The authors completed the ICMJE Form for Disclosure of Potential Conflicts of Interest and reported no conflicts of interest.
References
1. Gresham FM, Gansle KA, Noell GH. Treatment integrity in applied behavior analysis with children. J Appl Behav Anal. 1993;26:257–263.
2. Moncher FJ, Prinz RJ. Treatment fidelity in outcome studies. Clin Psychol Rev. 1991;11:247–266.
3. Dane AV, Schneider BH. Program integrity in primary and early secondary prevention: are implementation effects out of control? Clin Psychol Rev. 1998;18:23–45.
4. Gresham FM. Assessment of treatment integrity in school consultation and prereferral intervention. School Psych Rev. 1989;18:37–50.
5. Salamh P, Cook C, Reiman MP, Sheets C. Treatment effectiveness and fidelity of manual therapy to the knee: a systematic review and meta-analysis. Musculoskeletal Care. 2017;15:238–248.
6. Walker MF, Hoffmann TC, Brady MC. Improving the development, monitoring and reporting of stroke rehabilitation research: consensus-based core recommendations from the stroke recovery and rehabilitation roundtable. Neurorehabil Neural Repair. 2017;31:877–884.
7. Yamato TP, Maher CG, Saragiotto BT, Hoffmann TC, Moseley AM. How completely are physiotherapy interventions described in reports of randomised trials? Physiotherapy. 2016;102:121–126.
8. Quinn L, Trubey R, Gobat N. Development and delivery of a physical activity intervention for people with Huntington disease: facilitating translation to clinical practice. J Neurol Phys Ther. 2016;40:71–80.
9. Toomey E, Matthews J, Guerin S, Hurley DA. Development of a feasible implementation fidelity protocol within a complex physical therapy–led self-management intervention. Phys Ther. 2016;96:1287–1298.
10. Hoffmann TC, Glasziou PP, Boutron I. Better reporting of interventions: Template for Intervention Description and Replication (TIDieR) checklist and guide. BMJ. 2014;348:g1687.
11. Vaughan-Graham J, Cott C, Wright FV. The Bobath (NDT) concept in adult neurological rehabilitation: what is the state of the knowledge? A scoping review. Part I: conceptual perspectives. Disabil Rehabil. 2015;37:1793–1807.
12. Morgan C, Darrah J, Gordon AM. Effectiveness of motor interventions in infants with cerebral palsy: a systematic review. Dev Med Child Neurol. 2016;58:900–909.
13. Dobson D, Cook TJ. Avoiding type III error in program evaluation. Results from a field experiment. Eval Program Plann. 1980;3:269–276.
14. Yamatani H, Mann A, Feit M. Avoiding type III, IV, and V errors through collaborative research. J Evid Based Soc Work. 2013;10:358–364.
15. Durlak JA, DuPre EP. Implementation matters: a review of research on the influence of implementation on program outcomes and the factors affecting implementation. Am J Community Psychol. 2008;41:327–350.
16. Abry T, Hulleman CS, Rimm-Kaufman SE. Using indices of fidelity to intervention core components to identify program active ingredients. Am J Eval. 2015;36:320–338.
17. Hulleman CS, Cordray DS. Moving from the lab to the field: the role of fidelity and achieved relative intervention strength. J Res Educ Eff. 2009;2:88–110.
18. Knoche LL, Sheridan SM, Edwards CP, Osborn AQ. Implementation of a relationship-based school readiness intervention: a multidimensional approach to fidelity measurement for early childhood. Early Child Res Q. 2010;25:299–313.
19. Rezze BD, Law M, Eva K, Pollock N, Gorter JW. Development of a generic fidelity measure for rehabilitation intervention research for children with physical disabilities. Dev Med Child Neurol. 2013;55:737–744.
20. Breitenstein SM, Gross D, Garvey CA, Hill C, Fogg L, Resnick B. Implementation fidelity in community-based interventions. Res Nurs Health. 2010;33:164–173.
21. Dusenbury L, Brannigan R, Falco M, Hansen WB. A review of research on fidelity of implementation: implications for drug abuse prevention in school settings. Health Educ Res. 2003;18:237–256.
22. O’Donnell CL. Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K–12 curriculum intervention research. Rev Educ Res. 2008;78:33–84.
23. Mowbray CT, Holter MC, Teague GB, Bybee D. Fidelity criteria: development, measurement, and validation. Am J Eval. 2003;24:315–340.
24. Toomey E, Matthews J, Hurley DA. Using mixed methods to assess fidelity of delivery and its influencing factors in a complex self-management intervention for people with osteoarthritis and low back pain. BMJ Open. 2017;7:e015452.
25. McGrew JH, Bond GR, Dietzen L, Salyers M. Measuring the fidelity of implementation of a mental health program model. J Consult Clin Psychol. 1994;62:670–678.
26. Sheridan SM, Swanger-Gagné M, Welch GW, Kwon K, Garbacz SA. Fidelity measurement in consultation: psychometric issues and preliminary examination. School Psych Rev. 2009;38:476–495.
27. Sheridan SM, Marvin C, Knoche L, Edwards CP. Getting ready: promoting school readiness through a relationship-based partnership model. Early Childhood Services. 2008;2:21–45.
28. Borrelli B, Sepinwall D, Ernst D. A new tool to assess treatment fidelity and evaluation of treatment fidelity across 10 years of health behavior research. J Consult Clin Psychol. 2005;73:852.
29. Garbacz LL, Brown DM, Spee GA, Polo AJ, Budd KS. Establishing treatment fidelity in evidence-based parent training programs for externalizing disorders in children and adolescents. Clin Child Fam Psychol Rev. 2014;17:230–247.
30. Perepletchikova F, Kazdin AE. Treatment integrity and therapeutic change: issues and research recommendations. Clin Psychol Sci Pr. 2005;12:365–383.
© 2020 American Physical Therapy Association
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)