Journal Article

Mihee An, Department of Physical Therapy, Duquesne University, Pittsburgh, Pennsylvania; Stacey C Dusing, Department of Physical Therapy, Virginia Commonwealth University, Richmond, Virginia; Regina T Harbourne, FAPTA, Department of Physical Therapy, Duquesne University, 600 Forbes Avenue, Pittsburgh, PA 15282 (USA); Susan M Sheridan, Nebraska Center for Research on Children, Youth, Families and Schools, University of Nebraska-Lincoln, Lincoln, Nebraska. Dr Harbourne is a board-certified clinical specialist in pediatric physical therapy. Address all correspondence to Dr Harbourne at: harbourner@duq.edu.
Physical Therapy, Volume 100, Issue 5, May 2020, Pages 757–765, https://doi.org/10.1093/ptj/pzaa006
Published: 16 January 2020

Article history: Revision requested: 28 February 2019; Revision received: 01 July 2019; Accepted: 06 October 2019.
Mihee An, Stacey C Dusing, Regina T Harbourne, Susan M Sheridan, START-Play Consortium, What Really Works in Intervention? Using Fidelity Measures to Support Optimal Outcomes, Physical Therapy, Volume 100, Issue 5, May 2020, Pages 757–765, https://doi.org/10.1093/ptj/pzaa006
Abstract
A critical factor to move the field of physical therapy forward is the measurement of fidelity during comparisons of interventions. Fidelity translates as “faithfulness”; thus, fidelity of intervention means faithful and correct implementation of the key components of a defined intervention. Fidelity measurement guards against deviations from, or drift in, the delivery of a targeted intervention, a process necessary for evaluating the efficacy of rehabilitation approaches. Importantly, attention to fidelity measurement differentiates rehabilitation approaches from each other. However, earlier research comparing physical therapist interventions often reported findings without careful attention to fidelity measurement. The purpose of this paper is 2-fold: (1) to support the development of intervention-specific fidelity measures in physical therapy research as the gold standard for translating research findings to clinical practice, and (2) to describe the process of creating a multi-dimensional fidelity measurement instrument in rehabilitation intervention. Improved attention to fidelity measurement will allow the rehabilitation field to communicate interventions clearly with a direct link to outcomes and target the implementation of our improved intervention for the right patient problem with the right dose and the right ingredients at the right time.
Intervention fidelity refers to the degree to which a specific intervention is implemented as intended,1,2 critically supporting effectiveness research. Fidelity measurement underlies the reliable examination of the effect of interventions, particularly when 2 or more interventions are compared.3,4 When comparing treatments, fidelity measurement ensures the accurate presentation and examination of a rehabilitation approach and prevents potentially false conclusions. For example, imagine a clinical research study that compares 2 interventions (conventional intervention A and new intervention B) with a hypothesis that the new intervention B will be more effective than intervention A on a specific outcome. If intervention B is implemented poorly (eg, the interventionists omit some of the key components), the delivered interventions may not differ meaningfully, and when the outcomes between groups are compared at the end of the study, there may be no difference between the 2 groups. In this example, the lack of support for the new intervention B may be because it is not as effective as expected; alternatively, it may be due to inadequate implementation of intervention B or the implementation of something different from either A or B. It is impossible to distinguish poor effectiveness from poor fidelity of implementation unless fidelity measurement occurred during the study. In addition, if other researchers replicate the research but maintain high fidelity of intervention B, intervention B may result in stronger outcomes than intervention A. In the end, the 2 studies present opposing results, making it difficult to draw a conclusion about the effectiveness of the new intervention and difficult to translate these competing results to clinicians.
Intervention fidelity has been addressed in psychological and health behavioral research, but only recently have studies of intervention using fidelity measurement in physical therapy been published.5–7 Although a few studies have developed and used a fidelity measure in physical therapy,8,9 attention to fidelity measurement is still very limited. A recent overview extracted a random sample of 200 reports of randomized controlled trials from the Physiotherapy Evidence Database and evaluated the completeness of the descriptions of physical therapist interventions using the Template for Intervention Description and Replication (TIDieR) checklist.10 According to the TIDieR checklist, 23% of the trials incompletely described the intervention group, and 75% of trials incompletely described the control group intervention, indicating incomplete description of the majority of intervention practices in physical therapy randomized controlled trials. Most studies investigating interventions for stroke and osteoarthritis did not clearly describe how interventions were developed or monitored during the trial.5,6 Recent reviews of both adult neurologic11 and pediatric interventions12 reported serious methodologic concerns in most studies due to a lack of fidelity measurement, thus reducing the impact of the conclusions that could be drawn from the studies. Attention to fidelity measurement during comparisons of interventions is vital to optimize rehabilitation research and evidence-based clinical care, but it is currently lacking.
The purpose of this perspective is to support the importance of using a unique, multi-dimensional fidelity measure for any comparison or description of interventions to advance rehabilitation effectiveness and efficacy. Additionally, we describe in detail the process of creating an intervention-specific fidelity measurement tool to examine a new intervention.
Importance of Measuring Intervention Fidelity
Adequate fidelity measurement and reporting of intervention fidelity improves the interpretability of the outcome data in research studies as well as the replicability of the intervention, thus easing clinical translation. Fidelity measurement also adds to the rigor of the analysis when statistically comparing groups and adds to the internal validity of the research project. Although researchers clearly understand the dangers of Type I error (a false-positive result or incorrectly rejecting a null hypothesis that is true) and Type II error (a false-negative result or failing to reject the null hypothesis when it is false), another important type of error (Type III) has received little attention. Type III error has been described as getting the right answer for the wrong reason.13 In the context of fidelity measurement, a Type III error occurs when a mismatch between the intended intervention and the delivered intervention leads researchers to evaluate something that did not occur, thereby confounding the actual research question.13,14 When a Type III error occurs, research findings may provide misleading information. For instance, stakeholders may decide to stop effective programs based on negative results that derive from poor implementation (due to inadequate fidelity of the planned intervention) rather than an ineffective intervention. Without the proper evaluation of intervention fidelity, researchers and clinicians cannot determine whether the lack of impact is due to the intervention itself (eg, poorly conceptualized, wrong theory, or invalid intervention) or poor translation of the intervention to the implementation phase.
Higher levels of intervention fidelity have been shown to contribute to better intervention outcomes.3,15 For example, a recent review evaluating data from more than 500 studies found that when interventions are well implemented (with high fidelity), the magnitude of mean effect sizes is at least 2 to 3 times larger than when fidelity of intervention is low.15 When fidelity measurement of a specific intervention occurs, that quantity can be used as a moderator in the statistical examination of treatment effects, thereby revealing whether the intervention is differentially effective under varying levels of intervention fidelity.16,17 Thus, support for the inclusion of fidelity measures when examining intervention is strong.
As the importance of intervention fidelity has been established, attention to creating strong fidelity measurement has increased. Intervention fidelity can be assessed by either generic measures or intervention-specific measures.9,18,19 Generic fidelity measures evaluate general attributes of intervention that are essential for any intervention (eg, therapist-family rapport) and strategies that are commonly used across similar interventions.19,20 Although generic fidelity measures can be broadly applied to theoretically similar interventions, they are insufficient to capture the unique components of an intervention that distinguish it from other interventions. Intervention-specific fidelity measures evaluate key components of a particular intervention and allow for replicating efficacy or effectiveness studies on the intervention. It is particularly important to develop and use a well-designed, intervention-specific fidelity measure that reflects the key components of that intervention when a new intervention is examined or compared with existing practice. Below, we describe the dimensions of fidelity measurement needed to create an intervention-specific instrument.
Five Dimensions of Fidelity
Intervention fidelity is a multidimensional, measurable construct with both quantitative and qualitative dimensions. Fidelity is commonly described with 5 dimensions amenable to measurement: adherence, dosage, quality of intervention delivery, participant responsiveness, and program differentiation (Figure).3,21,22 Adherence measures the adequate delivery of the key components of an intervention; in other words, delivery of the intervention as it was designed. Dosage describes the amount of a specific intervention delivered, often measured in terms of time. Quality of intervention delivery is the way interventionists deliver the intervention using overall processes or strategies as prescribed by developers. Participant responsiveness indicates the extent to which participants respond to or are engaged by the intervention and is another overall qualitative judgment. Lastly, program differentiation involves the extent to which the components and processes of the intervention being studied differ from other interventions (eg, in a comparison of interventions study). Monitoring of the comparison group is important for program differentiation because the compared (often the standard care) intervention may contain some of the same key components as the target intervention. Program differentiation can be key to identifying active ingredients that are essential for successful intervention. Below, we further examine each fidelity dimension and the way in which it can be measured.
Table

Example Approaches to Measure Multiple Dimensions of Intervention Fidelity^a

|  | SOLAS Study9,24 | Getting Ready Study18 |
|---|---|---|
| Methods of data collection | Checklist by self-report, direct observation, and audio recording | Behavioral coding by observation of video recording |
| Fidelity measurement for comparison group | No (no comparison group) | Yes (Getting Ready intervention in experimental group vs typical early intervention in comparison group) |
| Adherence | 25 components for each of 6 weekly sessions were identified. Individual components rated as yes/present (a score of 2), no/absent (a score of 0), or attempted (a score of 1). Overall adherence score was computed by summing scores of all components | 11 intervention strategies were identified. Individual strategy use was coded as present if observed at all during the 1-min interval (1-min partial-interval recording procedure). Overall adherence was determined by proportion of intervals in which each strategy was used (individual strategy use) and sum of proportions of individual strategy use (total strategy use) |
| Dosage | Duration of each session (education + exercise components) was documented; actual duration of exercise component was compared with its intended duration of 45 min | Defined as number of sessions completed, but not included in the data analysis because it was consistent across all professionals in both groups as a function of school readiness programming |
| Quality of intervention delivery | Not measured | Professionals' effectiveness in providing Getting Ready intervention strategies was rated on a scale of 1 to 4. 1 = professional does not encourage/invite parental participation; is entirely focused on child and ineffective in initiating conversations with parent. 4 = professional provides ample opportunities for collaboration and initiates meaningful conversation with parent; is focused on parent–child relationship and completely effective in initiating conversations and discussions with parent |
| Participant responsiveness | Not measured | Parental level of interest and engagement with professional were rated on a global scale of 1 to 4. 1 = parent does not indicate interest in material or activities presented by professional; parent participation is more passive and limited. 4 = parent displays much interest in or initiates activities with professional and participates in bidirectional discussions; parent's participation is active (eg, initiates and elaborates on topics of discussions) |
| Program differentiation | Not measured | Multiple variables were compared between experimental and comparison groups, including: proportion of Getting Ready strategy use (adherence); ratings of professionals' effectiveness (quality of intervention delivery); ratings of parental interest/engagement (participant responsiveness) |

^a Getting Ready = relationship-based school readiness intervention for children from birth to age 5; SOLAS = Self-management of Osteoarthritis and Low back pain through Activity and Skills.
Measurement of Individual Dimensions of Fidelity
Although intervention fidelity optimally includes 5 dimensions, previous intervention research has typically measured only 1 or 2 of them (eg, adherence, dosage).3,21 To date, no physical therapy study has measured all 5 dimensions of intervention fidelity. Because each dimension is an important aspect of fidelity, optimal fidelity measurement tools should be multi-dimensional. The 5 dimensions of fidelity together measure both structure (eg, adherence and dosage) and process (eg, quality of delivery).22,23 Fidelity to structure measures both the content and the amount of the intervention. Fidelity to process captures the way in which the intervention is delivered and involves interactions between therapist and patient/family. Fidelity measurement of process requires qualitative judgment, which may be more difficult to measure reliably but may be significant in terms of intervention effects.23 To capture a comprehensive picture of intervention fidelity and consequently produce an accurate examination of the effectiveness of an intervention, measurement of all 5 dimensions of fidelity is recommended.3,21 However, evaluating all 5 dimensions is not always feasible or practical. Thus, it falls to the researcher to develop a unique fidelity measure that is relevant to their research question and to the outcomes of the specific intervention, selecting among the 5 dimensions as needed for the study. Measurement approaches for each of the 5 dimensions of fidelity, with 2 example studies, are presented below and in the Table.
Adherence
Adherence can be quantified based on whether each key component of an intervention is present or absent. If a key component only needs to occur once, then a single score of present or absent is adequate. For instance, in research examining the feasibility of a group-based complex behavioral intervention to promote self-management for people with osteoarthritis or chronic low back pain (SOLAS study), researchers developed an intervention fidelity protocol and checklists consisting of approximately 25 components for each of 6 weekly sessions. Each component was rated as yes/present (a score of 2), no/absent (a score of 0), or attempted (a score of 1), and the overall adherence score was computed by summing the scores of all components.9,24 Another option, if more detail is needed, is to use the percent of occurrence of each component. For example, in research examining the efficacy of the Getting Ready intervention (promoting school readiness through enhancing parent engagement), researchers defined 11 strategies of the Getting Ready intervention and recorded interventionists’ use of individual strategies using 1-minute partial-interval recording procedures.18 Use of each strategy was coded as present if it was observed at all during 1-minute intervals; then, the proportion of time (percent of the session) in which each strategy was used was calculated. These 2 examples both show measures of adherence; the SOLAS study used counts, and the Getting Ready study used percentages. Both methods provided quantification of the adherence dimension of fidelity.
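The two scoring schemes above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the SOLAS or Getting Ready studies; the component names, strategy data, and function names are hypothetical.

```python
def checklist_adherence(ratings):
    """SOLAS-style scoring: each component rated 2 (yes/present),
    1 (attempted), or 0 (no/absent). Returns the summed score and
    the percent of the maximum possible score."""
    total = sum(ratings.values())
    max_score = 2 * len(ratings)
    return total, 100.0 * total / max_score

def interval_adherence(intervals):
    """Getting Ready-style partial-interval recording: each 1-min
    interval is True if the strategy was observed at all during it.
    Returns the percent of intervals in which the strategy was used."""
    return 100.0 * sum(intervals) / len(intervals)

# Hypothetical session data for illustration
session = {"goal_setting": 2, "problem_solving": 1,
           "exercise_demo": 2, "feedback": 0}
score, pct = checklist_adherence(session)   # 5 of a possible 8 points

strategy_observed = [True, True, False, True, False, False, True, True]
rate = interval_adherence(strategy_observed)  # observed in 5 of 8 intervals
```

Counts and percentages answer the same question (was each key component delivered?) at different grains: the checklist gives a per-session total, whereas interval recording captures how pervasively a strategy was used within a session.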
Dosage
Dosage must be operationally defined and can be quantified using the number of sessions completed, the overall duration of the intervention, or the time spent implementing key components of the intervention as designed. Measurement of dosage supplies crucial information such as completeness of delivery and differences in dosage between groups. For example, in the SOLAS study,24 the researchers documented the duration of each session and compared the actual duration of the exercise component with its intended duration of 45 minutes (as a measurement of completeness of delivery). In the Getting Ready study,18 dosage was defined as the number of home visit sessions completed.
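A completeness-of-delivery calculation of the kind used in the SOLAS study can be sketched as follows. Only the 45-minute intended duration is taken from the study; the session durations and the helper name `dosage_completeness` are hypothetical.

```python
def dosage_completeness(actual_minutes, intended_minutes=45):
    """Percent of the intended exercise time actually delivered,
    capped at 100 so over-delivery does not inflate the average."""
    return min(100.0, 100.0 * actual_minutes / intended_minutes)

# Hypothetical exercise-component durations (minutes) for 6 sessions
sessions = [45, 38, 45, 30, 45, 42]
per_session = [dosage_completeness(m) for m in sessions]
mean_completeness = sum(per_session) / len(per_session)  # about 90.7%
```

Reporting per-session completeness alongside the mean also exposes drift, such as a single badly truncated session hidden inside an acceptable average.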
Quality of Intervention Delivery and Participant Responsiveness
Gestalt variables are suitable for evaluating fidelity to intervention process dimensions (eg, quality of intervention delivery and participant responsiveness). To use gestalt items (eg, overall quality ratings), criteria for the indicators should be clearly defined based on the theory or principles underlying the intervention. For example, early intervention and pediatric rehabilitation programs based on family-centered principles include interactive activities to facilitate family engagement and collaborative intervention. Therefore, the quality of interaction and the degree to which interactive activities focus on the key components are important to measure. In the Getting Ready intervention study,18 the researchers used global rating scales with scores ranging from 1 (low) to 4 (high) to measure quality of intervention delivery and participant responsiveness (Table). As a measure of the quality of intervention delivery, the overall effectiveness of the early childhood professional was rated based on the professional's ability to provide opportunities for collaboration, to initiate meaningful conversations and discussions with the parent, and to focus on the parent–child relationship. For participant responsiveness, the researchers rated parental level of interest and engagement with the professional (eg, parent engagement in discussion and brainstorming and active participation in activities) and with their child (eg, engaging in and starting meaningful conversations and interactions with their child).
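Once global ratings are collected, summarizing them per dimension is straightforward. The sketch below is illustrative only: the dimension names mirror the text, but the ratings and the 3.0 acceptability cutoff are assumptions, not values from the Getting Ready study.

```python
from statistics import mean

# Hypothetical session-level gestalt ratings (1 = low, 4 = high),
# in the style of the Getting Ready global scales
ratings = {
    "quality_of_delivery":        [4, 3, 4, 3, 4],
    "participant_responsiveness": [3, 2, 2, 3, 3],
}

# Mean rating per fidelity dimension across sessions
summary = {dim: mean(vals) for dim, vals in ratings.items()}

# Flag whether each dimension meets an a priori acceptability cutoff
ACCEPTABLE = 3.0
flags = {dim: avg >= ACCEPTABLE for dim, avg in summary.items()}
# Here delivery quality (mean 3.6) meets the cutoff, while
# participant responsiveness (mean 2.6) falls below it.
```

Because gestalt ratings involve qualitative judgment, they should be accompanied by interrater reliability checks before summaries like these are used as moderators in outcome analyses.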
Program Differentiation
Lastly, to measure program differentiation, researchers may use data from the other 4 dimensions of fidelity to determine differences between interventions. For example, researchers may measure therapists in both the experimental and standard care groups to see whether they implement key components of the assigned intervention (adherence), in what amount (dosage), and in what way (quality of delivery, participant responsiveness). In the Getting Ready study,18 program differentiation was determined based on a combination of the rate of strategy implementation (adherence), ratings of professionals' effectiveness in providing the Getting Ready intervention strategies (quality of intervention delivery), and ratings of parental engagement (participant responsiveness) in experimental and comparison (typical early intervention) groups. Measuring intervention fidelity in both groups allowed the researchers to identify components of the Getting Ready intervention that were also used in the standard care group and to verify the key components of the Getting Ready intervention that were clearly different from typical early intervention. They found that some strategies were implemented at similar rates in both groups, suggesting that these are generic strategies common to early intervention programs, whereas other strategies were implemented at significantly higher rates in the experimental group, indicating that they were unique to the Getting Ready intervention.
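A minimal sketch of this kind of group comparison: label each strategy as unique to the experimental intervention or shared with standard care, based on the gap in use rates between groups. The strategy names, rates, and the 20-percentage-point threshold are all hypothetical; the Getting Ready study used statistical significance tests rather than a fixed cutoff.

```python
def differentiation(exp_rates, comp_rates, min_gap=20.0):
    """Label each strategy 'unique' if its use rate (percent of
    intervals) in the experimental group exceeds the comparison
    group's rate by at least `min_gap` percentage points;
    otherwise label it 'shared' (a generic strategy)."""
    labels = {}
    for strategy, exp_rate in exp_rates.items():
        gap = exp_rate - comp_rates.get(strategy, 0.0)
        labels[strategy] = "unique" if gap >= min_gap else "shared"
    return labels

# Hypothetical strategy-use rates (percent of observed intervals)
exp  = {"guided_observation": 55.0, "affirm_parent": 40.0, "movement_practice": 70.0}
comp = {"guided_observation": 10.0, "affirm_parent": 35.0, "movement_practice": 65.0}
labels = differentiation(exp, comp)
# guided_observation is flagged unique; the other two are shared
```

The "shared" strategies are candidates for generic fidelity items, while the "unique" ones point toward the active ingredients that distinguish the new intervention.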
Initial Development of an Intervention-Specific Fidelity Measure
Creating an intervention-specific fidelity tool is generally a 3-step process.23 The first step, identifying possible indicators or key components of the approach by either expert consensus or previous research, is critical; this step is often repeated at the end of the process to assure the inclusion of all key components. The second step is to establish a measurement system, which involves decisions about how to measure the key components (eg, direct observation, coding video, or use of products such as written notes from the intervention) and how to determine whether the intervention is implemented with acceptable fidelity. The final step is to examine the reliability and validity of the fidelity measurement instrument. One practical consideration in developing a fidelity measurement instrument is the cost in time and money. Fidelity measurement adds cost to any study, and this should be accounted for during the design stage so that personnel needs are planned for and reflected in funding requests.
Identifying Key Components
The first step in developing a fidelity measure is the identification of key components of an intervention.25,26 A new intervention, or the intervention of interest in any training or comparison of effectiveness research, presents with particular claims regarding the content, process, training requirements, and potential outcome of that intervention. Ideally, the successful intervention components have been identified through prior investigations. In actual practice, therapists often “telegraph” the content of an intervention using shorthand (eg, a functional approach or a top-down approach). Successful development of a fidelity tool requires that an intervention be reduced to several key components that differentiate the intervention from other, possibly similar approaches for the same clinical problem. These key components of the intervention should fully define what the intervention is targeting and the portions of the intervention that, if removed, would change the intervention as a whole. Thus, the exploratory phase of developing a fidelity tool involves discussions with experts in interventions used for the clinical problem being addressed, examination of examples of the intervention of interest as well as counter-examples, and the use of evidence-based information. Because rehabilitation is a dynamic and interactive endeavor, many interventions will have common components, such as patient/therapist discussions, physical contact between patient and therapist, and movement practice. The development of the key component list must include the behaviors of the therapist, which can be observed within the course of the intervention and which fit within the theory driving the intervention. As part of the process of selecting key components, observing videos of both the new and the standard-of-care interventions, brainstorming, and critically discussing each component help to identify, name, and define the items.
Critical to the process of identifying key components of an intervention is the understanding of the theory underlying the intervention of interest. For example, the Getting Ready intervention is based on the important role that parents play in preschool readiness.18,27 The Getting Ready intervention focuses on promoting parent-child interactions and parents’ competence and confidence in maximizing children’s natural learning opportunities through collaborative partnerships between parents and professionals, which differs from other early intervention programs that focus on skills a child fails to perform. Therefore, the key components of the Getting Ready intervention include collaborative strategies used by professionals to support parents (eg, asking for and building on parental observations of child development, offering affirmation, encouraging parents to discuss and prioritize concerns and needs for their child).
Establishing the Measurement System
Once the key components of the intervention are defined, the second step in development of the fidelity measurement tool is to determine how to quantify the key components as well as other dimensions of fidelity. Self-report by participants and ratings by experts or trained raters are common methods to assess fidelity. For self-report measures, key components or steps comprising an intervention are listed, and participants document whether each of the components is completed. Self-report measures can be simple and less reliant on resources (eg, human, time, material) but may result in overestimation of fidelity.22 Observation (direct observation or watching videos of intervention) involves trained and reliable individuals, ideally independent from the research, documenting whether each of the key components occurred. It is more objective but more resource intensive than self-report and less commonly used in intervention studies. Using independent observers can add considerable cost to a study and requires careful calculation when preparing budget estimates. As each method has advantages and disadvantages, it falls to developers to determine a measurement method that is appropriate and sensitive to capturing the key components of an intervention. In some cases, more than one approach to measurement can be used to fully measure intervention fidelity (eg, video observation by independent raters to measure adherence and quality of intervention delivery, self-report by patient/family to measure participant responsiveness, chart review for attendance rate as a measurement of dosage).
Developers also need to establish criteria to determine whether the intervention is delivered with an acceptable level of fidelity. Developers may use previously reported levels of fidelity or establish their own criteria for an intervention. For example, self-report by therapists was used to measure fidelity in the SOLAS study24 and was based on previously reported criteria; 80% to 100% adherence to the fidelity checklist (ie, >80% of items on the checklist were implemented) was interpreted as high fidelity, 51% to 79% as moderate, and 50% or below as low fidelity.28–30 In the Getting Ready intervention study, no specific predetermined levels of fidelity to the Getting Ready intervention were indicated. Instead, as a first approximation, the researchers defined good fidelity as total strategy use of 50% (ie, strategies observed by an independent rater in over one-half of the intervals).18
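The checklist-based adherence bands described above reduce to simple arithmetic. A minimal sketch using the thresholds reported for the SOLAS study; the item counts in the usage lines are invented for illustration:

```python
def adherence_level(items_implemented, items_total):
    """Classify checklist adherence using the thresholds reported for
    the SOLAS study: 80%-100% high, 51%-79% moderate, 50% or below low."""
    pct = 100 * items_implemented / items_total
    if pct >= 80:
        return "high"
    if pct > 50:
        return "moderate"
    return "low"

print(adherence_level(17, 20))  # 85% -> "high"
print(adherence_level(12, 20))  # 60% -> "moderate"
print(adherence_level(9, 20))   # 45% -> "low"
```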
To establish criteria for a new intervention, developers may choose to determine a likely upper limit of fidelity by scoring sessions that the team and experts deem to be good examples of the target intervention. Subsequently, sessions of the comparison group (eg, standard intervention) can be scored, as well as sessions that barely implement the target intervention. With these 3 data points (from well implemented to barely implemented intervention to standard intervention), the range of fidelity scores can determine an acceptable range of fidelity (eg, acceptable if each of the key components of the intervention is implemented above a certain proportion/frequency) or levels of fidelity (eg, high, moderate, low fidelity). The criteria may also differ depending on the type of research or practice. For example, for an efficacy trial, researchers may set very high standards of adherence (eg, adherence should be >80%, quality of delivery should not be <3 on a scale of 1 to 4). If researchers are interested in how intervention fidelity affects the outcomes, they may set levels of fidelity (eg, high, moderate, or low fidelity) and use the fidelity value as a moderator when analyzing outcome data. In an effectiveness trial examining the intervention in a variety of clinical settings, the standards may be lower (eg, adherence >50% is acceptable) because control is reduced, and the goal is to see how the intervention fares when translated to clinical practice; the lower standard of fidelity may be expected when broadening the scope to multiple interventionists, facilities, or locations.
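The 3-data-point anchoring approach above can be sketched as follows. All session scores and cutoff rules here are hypothetical, chosen only to illustrate deriving fidelity bands from scored exemplar, barely implemented, and standard care sessions rather than from arbitrary numbers:

```python
# Hypothetical fidelity scores (0-100) from three kinds of scored sessions.
well_implemented = [88, 92, 85, 90]   # expert-vetted exemplar sessions
barely_implemented = [55, 60, 52]     # minimally acceptable sessions
standard_care = [20, 30, 25, 18]      # comparison-group sessions

# One simple anchoring rule (an assumption, not a published standard):
# "acceptable" begins at the lowest barely-implemented score,
# "high" begins at the lowest exemplar score.
acceptable_cutoff = min(barely_implemented)   # 52
high_cutoff = min(well_implemented)           # 85

def fidelity_band(score):
    """Band a session's fidelity score using the anchored cutoffs."""
    if score >= high_cutoff:
        return "high"
    if score >= acceptable_cutoff:
        return "acceptable"
    return "below acceptable"

print(fidelity_band(87))  # "high"
print(fidelity_band(70))  # "acceptable"
print(fidelity_band(35))  # "below acceptable"
```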
Testing Reliability and Validity
The third step in developing a fidelity measure is to test the measurement system in terms of reliability and validity. There are several ways to test psychometric properties of fidelity measures.23 First, developers can examine reliability of the fidelity measurement tool across respondents (eg, inter-rater reliability on behavioral coding) by calculating percentage of agreement, the kappa coefficient, the intraclass correlation coefficient, or Pearson correlation coefficients. This testing will demonstrate the clarity of the definitions of key components. One way to improve and clarify the definitions of key components is to have multiple raters observe and code the same intervention session (by direct observation or video review) and compare coding results. For components on which the raters disagree, raters and developers discuss, refine, and add more examples as needed. This process is repeated until the raters reach acceptable inter-rater reliability. Second, examining differences in fidelity scores across different types of interventions (known-group method) is an approach to test the validity of the fidelity measurement tool and will show whether the measure differentiates the target intervention from other interventions. Third, convergent validity can be tested by examining the agreement between 2 different sources of information about an intervention. For example, in the SOLAS study, researchers examined agreement between self-report by therapists, direct observation, and audio-record review by a researcher.24
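Percentage of agreement and the kappa coefficient mentioned above can be computed directly from two raters' codes. A minimal sketch for binary interval coding (1 = key component observed in an interval, 0 = not observed); the ratings are invented for illustration:

```python
def percent_agreement(r1, r2):
    """Proportion of intervals on which the two raters agree."""
    agree = sum(a == b for a, b in zip(r1, r2))
    return agree / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters' binary codes: observed agreement
    corrected for the agreement expected by chance."""
    n = len(r1)
    po = percent_agreement(r1, r2)          # observed agreement
    p1_yes = sum(r1) / n                    # rater 1's "yes" rate
    p2_yes = sum(r2) / n                    # rater 2's "yes" rate
    pe = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)  # chance agreement
    return (po - pe) / (1 - pe)

rater1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater2 = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
print(round(percent_agreement(rater1, rater2), 2))  # 0.8
print(round(cohens_kappa(rater1, rater2), 2))       # 0.52
```

Because kappa discounts chance agreement, it is lower than raw percentage agreement here (0.52 vs 0.80), which is why it is often preferred for behavioral coding.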
Finally, after the fidelity measurement tool is refined and utilized, the key ingredients of the intervention can be verified (Figure). Active ingredients of an intervention are identified after comparison with another intervention is complete, and outcomes are compared. Program differentiation of the 2 interventions will allow confirmation of the clear differences between interventions that can then be directly linked to improved outcomes and contribute to clear translation of the active ingredients of the intervention.
Use of Fidelity Measurement in Clinical Practice
Successful implementation of a new intervention depends on an adequate support system, which can be guided by an intervention-specific fidelity measure. To optimize and standardize implementation of a new intervention, researchers and clinicians may use fidelity measurement to develop therapist supports, including manuals, training, monitoring on-site or with video, or direct supervision. A detailed manual and training can prepare therapists for the provision of new intervention strategies and improve their confidence in implementing a new intervention.
The adequacy of training and competence of implementation can also be evaluated by assessing intervention fidelity (eg, whether therapists meet the predetermined fidelity criteria) during and after the training process, which ensures therapists receive adequate training to deliver the intervention as designed. Ongoing support and monitoring can prevent deviations from the intended intervention or, if any deviation occurs, address and correct it immediately, which leads to a high degree of fidelity and consequently improves outcomes. Even if the support system is unavailable or only partially available (eg, therapists from different locations trying to replicate the intervention), a fidelity measure describing the key components of the intervention can help therapists apply the intervention. In short, use of fidelity measurement can not only facilitate successful implementation and a clearer link to the outcomes of intervention in a research project but can also enhance the successful translation of evidence-based intervention into practice.
Conclusion
This article supports the importance of using a multi-dimensional fidelity measure in intervention comparison research and describes the process of developing an intervention-specific fidelity measure to examine the effectiveness and efficacy of a new intervention. Detailed description and adequate measurement of an intervention are essential to ensure the internal and external validity of rehabilitation research. Without adequate measurement of fidelity to intervention, there is no way to determine whether an unsuccessful outcome is due to an ineffective intervention or to poor implementation of the intervention. In addition, an intervention cannot be replicated without fidelity measurement. The descriptions and examples provide a springboard for addressing fidelity measurement in physical therapy research and for facilitating its uptake in rehabilitation practice.
Author Contributions and Acknowledgments
Concept/idea/research design: M. An, S.C. Dusing, R.T. Harbourne, S.M. Sheridan
Writing: M. An, S.C. Dusing, R.T. Harbourne
Data collection: S.C. Dusing, R.T. Harbourne
Data analysis: S.C. Dusing, R.T. Harbourne
Project management: S.C. Dusing, R.T. Harbourne
Fund procurement: S.C. Dusing, R.T. Harbourne
Providing participants: S.C. Dusing, R.T. Harbourne
Providing facilities/equipment: S.C. Dusing, R.T. Harbourne
Consultation (including review of manuscript before submitting): S.M. Sheridan
The authors acknowledge the members of the START-Play Consortium: Duquesne University (R.T. Harbourne, PT, PhD, FAPTA; H.-J. Chang, PT, PhD; M. An, PT, PhD), Virginia Commonwealth University (S.C. Dusing, PT, PhD; E. Marcinowski, PhD), University of Delaware (M.A. Lobo, PT, PhD; A.B. Cunha, PT, PhD; I. Babik, PhD), University of Washington (S.W. McCoy, PT, PhD, FAPTA; L-Y. Hsu, PT, PhD; M. Smersh), University of Nebraska-Lincoln (J. Bovaird, PhD; N. Koziol, PhD; S. Sheridan, PhD), and University of Nebraska Medical Center (S.L. Willett, PT, MS).
Funding
Institute of Education Sciences, US Department of Education, grant # R324A150103, Efficacy of the START-Play Program for Infants with Neuromotor Disorders.
Disclosures
The authors completed the ICMJE Form for Disclosure of Potential Conflicts of Interest and reported no conflicts of interest.
References
1. Gresham FM, Gansle KA, Noell GH. Treatment integrity in applied behavior analysis with children. J Appl Behav Anal. 1993;26:257–263.
2. Moncher FJ, Prinz RJ. Treatment fidelity in outcome studies. Clin Psychol Rev. 1991;11:247–266.
3. Dane AV, Schneider BH. Program integrity in primary and early secondary prevention: are implementation effects out of control? Clin Psychol Rev. 1998;18:23–45.
4. Gresham FM. Assessment of treatment integrity in school consultation and prereferral intervention. School Psych Rev. 1989;18:37–50.
5. Salamh P, Cook C, Reiman MP, Sheets C. Treatment effectiveness and fidelity of manual therapy to the knee: a systematic review and meta-analysis. Musculoskeletal Care. 2017;15:238–248.
6. Walker MF, Hoffmann TC, Brady MC. Improving the development, monitoring and reporting of stroke rehabilitation research: consensus-based core recommendations from the stroke recovery and rehabilitation roundtable. Neurorehabil Neural Repair. 2017;31:877–884.
7. Yamato TP, Maher CG, Saragiotto BT, Hoffmann TC, Moseley AM. How completely are physiotherapy interventions described in reports of randomised trials? Physiotherapy. 2016;102:121–126.
8. Quinn L, Trubey R, Gobat N. Development and delivery of a physical activity intervention for people with Huntington disease: facilitating translation to clinical practice. J Neurol Phys Ther. 2016;40:71–80.
9. Toomey E, Matthews J, Guerin S, Hurley DA. Development of a feasible implementation fidelity protocol within a complex physical therapy–led self-management intervention. Phys Ther. 2016;96:1287–1298.
10. Hoffmann TC, Glasziou PP, Boutron I. Better reporting of interventions: Template for Intervention Description and Replication (TIDieR) checklist and guide. BMJ. 2014;348:g1687.
11. Vaughan-Graham J, Cott C, Wright FV. The Bobath (NDT) concept in adult neurological rehabilitation: what is the state of the knowledge? A scoping review. Part I: conceptual perspectives. Disabil Rehabil. 2015;37:1793–1807.
12. Morgan C, Darrah J, Gordon AM. Effectiveness of motor interventions in infants with cerebral palsy: a systematic review. Dev Med Child Neurol. 2016;58:900–909.
13. Dobson D, Cook TJ. Avoiding type III error in program evaluation. Results from a field experiment. Eval Program Plann. 1980;3:269–276.
14. Yamatani H, Mann A, Feit M. Avoiding type III, IV, and V errors through collaborative research. J Evid Based Soc Work. 2013;10:358–364.
15. Durlak JA, DuPre EP. Implementation matters: a review of research on the influence of implementation on program outcomes and the factors affecting implementation. Am J Community Psychol. 2008;41:327–350.
16. Abry T, Hulleman CS, Rimm-Kaufman SE. Using indices of fidelity to intervention core components to identify program active ingredients. Am J Eval. 2015;36:320–338.
17. Hulleman CS, Cordray DS. Moving from the lab to the field: the role of fidelity and achieved relative intervention strength. J Res Educ Eff. 2009;2:88–110.
18. Knoche LL, Sheridan SM, Edwards CP, Osborn AQ. Implementation of a relationship-based school readiness intervention: a multidimensional approach to fidelity measurement for early childhood. Early Child Res Q. 2010;25:299–313.
19. Rezze BD, Law M, Eva K, Pollock N, Gorter JW. Development of a generic fidelity measure for rehabilitation intervention research for children with physical disabilities. Dev Med Child Neurol. 2013;55:737–744.
20. Breitenstein SM, Gross D, Garvey CA, Hill C, Fogg L, Resnick B. Implementation fidelity in community-based interventions. Res Nurs Health. 2010;33:164–173.
21. Dusenbury L, Brannigan R, Falco M, Hansen WB. A review of research on fidelity of implementation: implications for drug abuse prevention in school settings. Health Educ Res. 2003;18:237–256.
22. O’Donnell CL. Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K–12 curriculum intervention research. Rev Educ Res. 2008;78:33–84.
23. Mowbray CT, Holter MC, Teague GB, Bybee D. Fidelity criteria: development, measurement, and validation. Am J Eval. 2003;24:315–340.
24. Toomey E, Matthews J, Hurley DA. Using mixed methods to assess fidelity of delivery and its influencing factors in a complex self-management intervention for people with osteoarthritis and low back pain. BMJ Open. 2017;7:e015452.
25. McGrew JH, Bond GR, Dietzen L, Salyers M. Measuring the fidelity of implementation of a mental health program model. J Consult Clin Psychol. 1994;62:670–678.
26. Sheridan SM, Swanger-Gagné M, Welch GW, Kwon K, Garbacz SA. Fidelity measurement in consultation: psychometric issues and preliminary examination. School Psych Rev. 2009;38:476–495.
27. Sheridan SM, Marvin C, Knoche L, Edwards CP. Getting ready: promoting school readiness through a relationship-based partnership model. Early Childhood Services. 2008;2:21–45.
28. Borrelli B, Sepinwall D, Ernst D. A new tool to assess treatment fidelity and evaluation of treatment fidelity across 10 years of health behavior research. J Consult Clin Psychol. 2005;73:852.
29. Garbacz LL, Brown DM, Spee GA, Polo AJ, Budd KS. Establishing treatment fidelity in evidence-based parent training programs for externalizing disorders in children and adolescents. Clin Child Fam Psychol Rev. 2014;17:230–247.
30. Perepletchikova F, Kazdin AE. Treatment integrity and therapeutic change: issues and research recommendations. Clin Psychol Sci Pr. 2005;12:365–383.
© 2020 American Physical Therapy Association
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)