Copyright © 2007 by The Journal of Bone and Joint Surgery, Inc.

Commentary & Perspective

Commentary & Perspective on
"Quality of Prospective Controlled Randomized Trials. Analysis of Trials of Treatment for Lateral Epicondylitis as an Example"
by James Cowan, BA, et al.

Commentary & Perspective by
Jeffrey N. Katz, MD, MSc, and Elena Losina, PhD*
Brigham and Women's Hospital, Harvard Medical School, and the Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts

Posted August 2007

Cowan and colleagues report in this issue of The Journal that the great majority of randomized controlled trials involving lateral epicondylitis received poor scores when subjected to standard methods for rating the quality of randomized controlled trials. The authors suggest that we rethink the sacred stature currently accorded to the randomized controlled trial in the medical literature. This is a provocative suggestion. The randomized controlled trial indeed sits atop the hierarchy of study designs, and for good reason. Recall that high-quality observational studies led a generation of women to take hormone replacement therapy to prevent cardiac disease1,2. Then a randomized controlled trial showed that, in fact, hormone replacement therapy raised—not lowered—the risk of cardiac disease3. The observational studies were well performed but incorrect.

Randomized controlled trials were originally introduced after World War II and focused primarily on pharmacologic interventions4. Decades later, the design is applied increasingly to surgery in general and to orthopaedic surgery in particular. Why are randomized controlled trials so useful? They have several critical methodological features, the most crucial of which is randomization. In randomized controlled trials, patients are randomly assigned to treatment groups. Randomization is intended to balance all patient characteristics across the treatment groups. This includes recognized, quantifiable patient features, such as age, sex, comorbidity, and baseline functional status, as well as unmeasured or qualitative predictors of outcome, such as a positive outlook on life and, perhaps, as yet unknown genetic factors that are related to treatment outcomes.
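The balancing property of randomization can be illustrated with a small simulation. The sketch below is hypothetical throughout: the patient attributes (`age`, `outlook`), their distributions, the arm labels, and the function names are assumptions chosen for illustration, not data from any trial. It assigns simulated patients to two arms by a coin flip and then compares the arms on a measured factor and on a factor that the investigators never recorded.

```python
import random
from statistics import mean


def simulate_randomization(n_patients=10_000, seed=0):
    """Assign simulated patients to two arms by a fair coin flip."""
    rng = random.Random(seed)
    arms = {"surgery": [], "nonoperative": []}
    for _ in range(n_patients):
        patient = {
            "age": rng.gauss(55, 10),    # measured covariate (assumed distribution)
            "outlook": rng.gauss(0, 1),  # unmeasured prognostic factor (assumed)
        }
        # The coin flip ignores every patient characteristic.
        arm = "surgery" if rng.random() < 0.5 else "nonoperative"
        arms[arm].append(patient)
    return arms


def mean_difference(arms, key):
    """Difference in the mean of one characteristic between the two arms."""
    return (mean(p[key] for p in arms["surgery"])
            - mean(p[key] for p in arms["nonoperative"]))


arms = simulate_randomization()
age_gap = mean_difference(arms, "age")
outlook_gap = mean_difference(arms, "outlook")
print(f"age gap: {age_gap:.3f} years, outlook gap: {outlook_gap:.3f}")
```

With a sample this large, both gaps should be close to zero even though `outlook` played no part in the assignment. In small trials, chance imbalances can remain.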

To optimize the benefit of randomization and the attendant balanced allocation of prognostic factors, an intention-to-treat approach should be used during the data-analysis stage of randomized controlled trials. This strategy analyzes all patients in the group to which they were randomly assigned, irrespective of whether they actually received the intended treatment. Thus, in a trial that compares conservative therapy with surgery for patients with meniscal disorders, the intention-to-treat analysis would include in the nonoperative arm all patients randomized to nonoperative therapy, even if a large number of them crossed over and had surgery over the course of the trial. The extent of crossover may depend on the complexity and risks associated with surgery. The Spine Patient Outcomes Research Trial (SPORT) provides an important example of the critical impact of crossover in surgical trials5,6. The SPORT trials for disc disease and for degenerative spondylolisthesis were launched because of the virtual absence of trial data to aid patients in making informed decisions with regard to treatment. Almost half of the patients randomly assigned to surgery in both of these trials never underwent the procedure. Further, half of the patients randomly assigned to nonoperative therapy crossed over to surgery. Crossover on this scale blurs the contrast between the randomized groups and dramatically degrades the effective sample size, rendering the randomized portions of the trials uninterpretable. It appears now that the only interpretable information we can glean on comparative treatment effects in SPORT is from the observational portions of the study5.
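The effect of crossover on an intention-to-treat comparison can be seen in a toy calculation. The records below are invented for illustration and are not SPORT data; the improvement scores, the tuple layout, and the function names are all assumptions. The sketch groups the same eight patients first by assigned arm and then by treatment actually received.

```python
# Hypothetical records: (assigned arm, treatment received, improvement score).
# All values are invented for illustration; they are not SPORT data.
patients = [
    ("surgery", "surgery", 30), ("surgery", "surgery", 30),
    ("surgery", "nonoperative", 20), ("surgery", "nonoperative", 20),
    ("nonoperative", "nonoperative", 20), ("nonoperative", "nonoperative", 20),
    ("nonoperative", "nonoperative", 20), ("nonoperative", "surgery", 30),
]

ASSIGNED, RECEIVED = 0, 1  # column positions within each record


def mean_outcome(records, column, arm):
    """Mean improvement among records whose chosen column equals `arm`."""
    scores = [r[2] for r in records if r[column] == arm]
    return sum(scores) / len(scores)


def arm_difference(records, column):
    """Surgery minus nonoperative, grouping by the chosen column."""
    return (mean_outcome(records, column, "surgery")
            - mean_outcome(records, column, "nonoperative"))


itt_difference = arm_difference(patients, ASSIGNED)         # as randomized
as_treated_difference = arm_difference(patients, RECEIVED)  # as received
print(itt_difference, as_treated_difference)
```

In this invented example the intention-to-treat difference is 2.5 points, whereas the as-treated difference is 10 points. The as-treated contrast is larger, but it is no longer protected by randomization, because patients who cross over may differ in prognosis from those who do not.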

In observational studies, imbalances in measured characteristics can be addressed with sophisticated statistical adjustment. But we cannot adjust for features that we cannot measure or do not even recognize. Whereas treatments are assigned, simplistically speaking, by the flip of a coin in the randomized controlled trial, patients and their physicians choose the treatments in observational studies. Factors that lead patients and physicians to choose one treatment over another may also affect the outcome and bias the study. For example, it may be that the subjects who took hormone replacement therapy in the observational studies were more health conscious and avoided cardiac risks more effectively than nonusers, resulting in an apparent protective effect of hormone replacement therapy.
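The health-consciousness explanation can be made concrete with a simulation. Everything in the sketch below is an assumption made for illustration: the probabilities, the variable names, and in particular the simplification that the therapy has no effect at all on cardiac events. When health-conscious subjects are more likely to choose the therapy, the observational comparison shows an apparent benefit; when a coin flip assigns the therapy, the apparent benefit should vanish.

```python
import random


def risk_difference(n_subjects, randomized, seed=1):
    """Event risk among users minus event risk among nonusers."""
    rng = random.Random(seed)
    events = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for _ in range(n_subjects):
        # Unmeasured trait that lowers cardiac risk (assumed prevalence 50%).
        health_conscious = rng.random() < 0.5
        if randomized:
            takes_hrt = rng.random() < 0.5  # coin flip
        else:
            # Self-selection: health-conscious subjects choose therapy more often.
            takes_hrt = rng.random() < (0.7 if health_conscious else 0.3)
        # Assumption: risk depends only on the trait, not on the therapy.
        event_risk = 0.05 if health_conscious else 0.15
        totals[takes_hrt] += 1
        events[takes_hrt] += rng.random() < event_risk
    return events[True] / totals[True] - events[False] / totals[False]


observational = risk_difference(100_000, randomized=False)
trial = risk_difference(100_000, randomized=True)
print(f"observational: {observational:+.3f}, randomized: {trial:+.3f}")
```

Under these assumed probabilities the observational risk difference should be clearly negative (an apparent protective effect), while the randomized difference should be near zero. Statistical adjustment could remove the bias only if the trait had been measured.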

Another critical methodological aspect related to randomized controlled trials is blinding. This is the procedure of masking treatment assignments. In a single-blind study, the subject does not know what treatment he or she received. In a double-blind study, neither the subject nor the evaluators on the study team know what treatment any particular subject received. The advantage of blinding is clear: it can reduce observer bias. The disadvantages include the logistical difficulty of achieving blinding in many settings and the ethical problems that may ensue. Surgical trials pose particular challenges in this regard. In drug trials, placebo tablets that look identical to the active agent facilitate blinding. In surgery, the same level of concealment may require a sham procedure. Barriers to implementation of sham surgery include ethical, operational, and financial considerations. For example, Moseley et al. randomized patients with osteoarthritis of the knee to either arthroscopic débridement or a sham procedure7. Lively discussion ensued with regard to whether it is ethical to expose a patient to the small but real risks of sham surgery.

Returning to the studies reviewed by Cowan and colleagues, are the poorly rated trials so flawed as to be uninterpretable? The scoring systems presented in Tables 1 and 2 of the paper by Cowan et al. penalize trials for errors of both commission and omission and for errors in design as well as in reporting. Points are subtracted if the trials did not blind assessors or if there was poor subject retention over the follow-up period. These are serious problems that could threaten validity. On the other hand, many of the quality criteria penalize investigators for failing to report particular features rather than failing to implement them. This is a less serious problem. In fact, many of these trials were published before the development of a standardized reporting format for randomized controlled trials, the use of which is now required by many journals8. Thus, failure to report certain design features may have no bearing on the validity of the trial findings.

Can a well-performed observational study provide the sort of definitive statement that emerges from an excellent trial? Should we focus resources on improving trials or on developing alternate designs? The two goals are compatible with one another; we can and should nurture both trial and cohort methodology. Indeed, some scientific objectives, such as long-term treatment outcomes and risk factors for complications, can best be addressed with longitudinal designs. But if the question is whether Treatment A works better than Treatment B, there is no good substitute for the randomized controlled trial.

Given the unique advantages of randomization and the many unresolved treatment questions in orthopaedics, continued work on improving the surgical randomized controlled trial seems an especially important priority for orthopaedic investigation. Several problems are most pressing, including the difficulty of blinding patients and providers in the surgical setting. Crossover degrades the effective sample size and makes the intention-to-treat analysis difficult to interpret. If the nonoperative group receives a highly heterogeneous set of treatments, it is difficult to characterize the contrast (surgery compared with what?). Negative trials pose further challenges in that they are less likely to be published than positive trials, potentially leading to publication bias. Similarly, trials with modest effect sizes may be regarded as negative if they are underpowered and do not show statistically significant differences. Yet meta-analyses of such trials may clarify true treatment effects. Thus, further work on blinding, limiting crossovers, standardizing nonoperative regimens, encouraging the publication of negative trials, and synthesizing quantitative data across the literature through meta-analysis will be especially fruitful starting points for strengthening the development and impact of orthopaedic randomized controlled trials.
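How pooling can clarify an effect that individual underpowered trials miss is sketched below. The three effect estimates and standard errors are hypothetical, and fixed-effect inverse-variance weighting is only one common pooling approach, chosen here as an assumption; a random-effects model may be more appropriate when trials are heterogeneous.

```python
import math

# Hypothetical (effect estimate, standard error) pairs for three small trials.
trials = [(4.0, 3.0), (5.0, 3.0), (3.0, 3.0)]


def z_score(effect, standard_error):
    """Effect divided by its standard error."""
    return effect / standard_error


def pool_fixed_effect(results):
    """Inverse-variance weighted average and its standard error."""
    weights = [1.0 / se ** 2 for _, se in results]
    pooled_effect = (sum(w * e for w, (e, _) in zip(weights, results))
                     / sum(weights))
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled_effect, pooled_se


pooled_effect, pooled_se = pool_fixed_effect(trials)
for effect, se in trials:
    print(f"single trial z = {z_score(effect, se):.2f}")
print(f"pooled z = {z_score(pooled_effect, pooled_se):.2f}")
```

None of the three invented trials reaches the conventional threshold of 1.96 on its own, yet the pooled estimate does, because the pooled standard error is smaller than that of any single trial.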

The work to be done cannot be accomplished by single investigators or disciplines. Teams of clinical investigators and methodologists, including biostatisticians, epidemiologists, and surgeons with a sensitivity for both clinical and methodological concepts, are needed to successfully apply randomized controlled trial design to surgical problems. We have the personnel and tools to continue to move this research agenda forward. The many unanswered questions in orthopaedic and musculoskeletal care impel us to do so.

*The authors did not receive any outside funding or grants in support of their research for or preparation of this work. Neither they nor a member of their immediate families received payments or other benefits or a commitment or agreement to provide such benefits from a commercial entity. No commercial entity paid or directed, or agreed to pay or direct, any benefits to any research fund, foundation, division, center, clinical practice, or other charitable or nonprofit organization with which the authors, or a member of their immediate families, are affiliated or associated.

References

1. Grodstein F, Manson JE, Stampfer MJ. Postmenopausal hormone use and secondary prevention of coronary events in the Nurses' Health Study. A prospective, observational study. Ann Intern Med. 2001;135:1-8.
2. Stampfer MJ, Colditz GA, Willett WC, Manson JE, Rosner B, Speizer FE, Hennekens CH. Postmenopausal estrogen therapy and cardiovascular disease. Ten-year follow-up from the Nurses' Health Study. N Engl J Med. 1991;325:756-62.
3. Manson JE, Hsia J, Johnson KC, Rossouw JE, Assaf AR, Lasser NL, Trevisan M, Black HR, Heckbert SR, Detrano R, Strickland OL, Wong ND, Crouse JR, Stein E, Cushman M; Women's Health Initiative Investigators. Estrogen plus progestin and the risk of coronary heart disease. N Engl J Med. 2003;349:523-34.
4. Medical Research Council, Streptomycin in Tuberculosis Trials Committee. Streptomycin treatment of pulmonary tuberculosis. Br Med J. 1948;2:769-83.
5. Weinstein JN, Lurie JD, Tosteson TD, Hanscom B, Tosteson AN, Blood EA, Birkmeyer NJ, Hilibrand AS, Herkowitz H, Cammisa FP, Albert TJ, Emery SE, Lenke LG, Abdu WA, Longley M, Errico TJ, Hu SS. Surgical versus nonsurgical treatment for lumbar degenerative spondylolisthesis. N Engl J Med. 2007;356:2257-70.
6. Weinstein JN, Tosteson TD, Lurie JD, Tosteson AN, Hanscom B, Skinner JS, Abdu WA, Hilibrand AS, Boden SD, Deyo RA. Surgical vs nonoperative treatment for lumbar disk herniation: the Spine Patient Outcomes Research Trial (SPORT): a randomized trial. JAMA. 2006;296:2441-50.
7. Moseley JB, O'Malley K, Petersen NJ, Menke TJ, Brody BA, Kuykendall DH, Hollingsworth JC, Ashton CM, Wray NP. A controlled trial of arthroscopic surgery for osteoarthritis of the knee. N Engl J Med. 2002;347:81-8.
8. Moher D, Schulz KF, Altman D; CONSORT Group (Consolidated Standards of Reporting Trials). The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. JAMA. 2001;285:1987-91.