The Journal of Bone and Joint Surgery 80:925-6 (1998)
© 1998 The Journal of Bone and Joint Surgery, Inc.


Correspondence

Joseph Bernstein, M.S., M.D., Robert L. Barrack, M.D., Michael W. Wolfe, M.D., Alexander J. Bertot, M.D., Douglas A. Waldman, M.D., Matko Milicic, M.D. and Leann Myers, Ph.D.

TO THE EDITOR:

"Resurfacing of the Patella in Total Knee Arthroplasty. A Prospective, Randomized, Double-Blind Study" (79-A: 1121–1131, Aug. 1997), by Barrack et al., while excellent, would have been better had the authors reported the statistical power (3) of their study. Power quantitates the likelihood of a so-called type-II error, that is, concluding that two groups are similar when they are indeed different.

Examination of the comparative data shows that the group that had patellar resurfacing actually gained more motion and had greater improvements in the pain, function, and overall scores. (It may be said that these were clinically unimportant differences, and they may be, but that is not a statistical argument.) As the probability was more than 0.05 that these distinctions owed more to chance than to true differences—that is, the differences were not significant—Barrack et al. concluded that the outcomes in the two groups were similar. This may be a type-II error. Merely failing to prove that A is different than B is not tantamount to proving that A equals B. It is possible that a more powerful study could have detected significant differences between the groups. A report of power allows the reader to consider explicitly the likelihood of that possibility.

Barrack et al. also noted that, because of the short duration of follow-up, the results were preliminary. This wording may still be insufficiently limiting. There are strong theoretical reasons, as suggested by Buckwalter (2), that deterioration in the so-called control group may be lurking just beyond the follow-up period. (Buckwalter himself said that a five-year follow-up period would be reasonable for articular procedures but conceded the impracticality of such a demand.)

In strict terms, the group that did not receive a patellar component was not a control group. Rather, they received a different treatment, namely, abrasion chondroplasty. This operation is known to decrease symptoms in the degenerated knee but only for brief periods (4). The operation may stimulate the deposition of repair cartilage (thus providing relief), but this repair tissue, which is histologically distinct from true hyaline cartilage, cannot withstand the mechanical loading of the joint for long and deteriorates with time.

Two small points may also be worth noting. The first is that there was overlap between the preoperative and postoperative scores within both groups. In other words, some patients' postoperative knee scores were lower than other patients' preoperative scores. Given the high rate of patient satisfaction, it may be reasonably inferred that some of the patients who had a low score may, in fact, be considered to have had a clinical success. This phenomenon illustrates that scoring systems serve only as a proxy for patient utility, the true measure of how the patient values the outcome. Thus, it would be incorrect to arbitrarily state that a given preoperative score is an indication for an operation or that a certain postoperative score is a measure of success. Barrack et al. appeared to be a bit chagrined that their mean score of 172.7 was lower than published norms. They are too modest; on average, the postoperative scores were more than double the preoperative scores. This finding is redolent of success to me.

Finally, I congratulate the authors on maintaining the distinction between an author and a contributor. The nurse who examined all of the patients was acknowledged at the end of the article, but her name was not included in the byline. Some may argue that her participation indeed deserves authorship. Yet, as noted recently by Rennie et al. (5), authorship is not a coin with which to pay contributors. Rather, authorship speaks about fundamental contributions to the design, execution, and analysis of the study.

This work by Barrack et al. is an important step toward bringing orthopaedics within the pantheon of so-called evidence-based medicine. This is a major achievement.

Joseph Bernstein, M.S., M.D.: Department of Orthopaedic Surgery, PENN Musculoskeletal Institute, 39th and Market Streets, 1 Cupp Pavilion, Presbyterian Medical Center, Philadelphia, Pennsylvania 19104

Dr. Barrack, Dr. Wolfe, Dr. Waldman, Dr. Milicic, Dr. Bertot, and Dr. Myers reply:

The kind remarks of Dr. Bernstein are greatly appreciated. We understand the importance of statistical power, but we did not think that it was necessarily appropriate in this case. This type of analysis has, in fact, rarely been reported in such clinical studies, although perhaps it should be used more frequently.

We believe that authors frequently use the term statistical power somewhat euphemistically to mean "having adequate data to discern relationships and applying the proper statistical test." However, we assume that Dr. Bernstein is inquiring about power as it is defined statistically.

We would like to point out that power does not quantitate the likelihood of a type-II error (failure to find a difference when one exists). Power is the opposite—the conditional probability of finding a significant difference if one exists. For power to have meaning, one must assume that there is a significant difference to be discerned.
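The relationship between power and the type-II error rate can be illustrated with a short numerical sketch. It assumes a two-sided, two-sample z-test on a standardized difference (a normal approximation with equal, known variance), which is not the analysis of covariance used in the study. The function name, the group size of 60, and the effect size of 0.5 are hypothetical values chosen for illustration and are not taken from the article.

```python
from statistics import NormalDist


def power_two_sample_z(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the common
    standard deviation (an assumption of this sketch).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is effect_size.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region given the true effect.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical inputs, for illustration only.
power = power_two_sample_z(effect_size=0.5, n_per_group=60)
beta = 1 - power  # probability of a type-II error under the same assumptions

print(f"power = {power:.3f}, type-II error rate = {beta:.3f}")
```

Under these assumptions, power and the type-II error rate always sum to one, so reporting either quantity determines the other. When the true effect is zero, the function returns the significance level alpha, which is the type-I error rate.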

Likewise, the null hypothesis cannot be proved; rather, it can only be accepted or rejected. Thus, we cannot claim that there was no difference between the two treatments, only that there was no clinical or statistical evidence of a difference between the groups. As the standard of non-significance is usually p > 0.15 or, more stringently, p > 0.25, our p values are sufficiently high that, in keeping with current standards, we are reasonably confident that a type-II error has not been made (6). For instance, the p values for the overall, pain, and function scores were 0.63, 0.56, and 0.77, respectively. The difference between the groups with regard to the overall score was 3.6 of 200 points, or 1.8 points if normalized to a 100-point scale. Given a standard deviation of several times that number, this is not a situation in which statistical power should be a concern.

Another way to address the question is to look at the minimum difference that we could have found. Given the variability and size of our sample, we could have found a minimum difference of about half of a standard deviation unit. The actual difference between the groups approximated one-tenth of a standard deviation unit. At some point, the realities of performing clinical research must supervene and the results must be accepted as they appear. We believe, given the imprecision of knee scores and the minuscule differences that we found between the groups, that our conclusions are justified.
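The idea of a minimum detectable difference can be sketched with the same normal approximation. The group size of 60, the two-sided significance level of 0.05, and the power of 0.80 are assumptions made for illustration, since the letter does not state them. The observed effect of 0.1 standard deviation units is the approximate figure given above. Both function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def minimum_detectable_effect(n_per_group: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest standardized difference detectable with the given power
    (two-sided, two-sample z-test, equal group sizes)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * (2 / n_per_group) ** 0.5


def n_per_group_for_effect(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Group size needed to detect a given standardized difference."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_sum / effect_size) ** 2)


hypothetical_n = 60      # assumed group size, not taken from the article
observed_effect = 0.1    # "one-tenth of a standard deviation unit" per the reply

mde = minimum_detectable_effect(hypothetical_n)
n_needed = n_per_group_for_effect(observed_effect)

print(f"minimum detectable effect at n={hypothetical_n}: {mde:.2f} SD")
print(f"group size needed to detect {observed_effect} SD: {n_needed}")
```

With these assumed inputs, the minimum detectable difference comes out near half of a standard deviation unit, and detecting a difference of one-tenth of a standard deviation unit would require well over a thousand knees per group. The sketch ignores the covariate adjustment in the actual analysis, so it indicates scale only.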

The relevant means were all included in our article. The p values were also included and were quite high. We used the most powerful statistical test available (analysis of covariance) when possible and confirmed all results with the non-parametric Kruskal-Wallis test.

We agree that problems may be lurking in the future for the patients in our study, but we maintain a priori that both groups have an equal chance of problems. Nevertheless, a longer-term follow-up study is planned to address this question.

We must further correct Dr. Bernstein by pointing out that the article did not state that abrasion chondroplasty was performed. We removed osteophytes and drilled eburnated bone; no articular cartilage was removed from the knees that were not resurfaced.

Indeed, scores do not always reflect the so-called success of a procedure. We are not chagrined regarding our patients' knee scores compared with historical results. On the contrary, we pointed out that a previous study from our institution (1) revealed an average knee score of only 180.2 of 200 points for asymptomatic individuals and that our scores, when normalized to that standard, are quite high. We agree with Dr. Bernstein that knee scores remain limited in their ability to reflect patient utility. We therefore included patient-reported measures of performance and satisfaction in order to enable the reader to better judge the outcomes represented.

Robert L. Barrack, M.D.; Michael W. Wolfe, M.D.; Alexander J. Bertot, M.D.: Department of Orthopaedic Surgery, SL-32, Tulane University School of Medicine, 1430 Tulane Avenue, New Orleans, Louisiana 70112

Douglas A. Waldman, M.D.: Orthopaedic Surgery Section, Surgical Service, Veterans Administration Medical Center, 3351 Masonic Drive, Alexandria, Louisiana 71301

Matko Milicic, M.D.: Orthopaedic Surgery Section, Surgical Service, Veterans Administration Medical Center, 1601 Perdido Street, New Orleans, Louisiana 70148

Leann Myers, Ph.D.: Department of Biostatistics and Epidemiology, Tulane University School of Public Health and Tropical Medicine, 1501 Canal Street, New Orleans, Louisiana 70112

References

  1. Brinker, M. R.; Lund, P. J.; and Barrack, R. L.: Demographic biases of scoring instruments for the results of total knee arthroplasty. J. Bone and Joint Surg., 79-A: 858-865, June 1997.
  2. Buckwalter, J. A.: Were the Hunter brothers wrong? Can surgical treatment repair articular cartilage? Iowa Orthop. J., 17: 1-13, 1997.
  3. Colton, T.: Statistics in Medicine. Boston, Little, Brown, 1974.
  4. Newman, A. P.: Articular cartilage repair. Am. J. Sports Med., 26: 309-324, 1998.
  5. Rennie, D.; Yank, V.; and Emanuel, L.: When authorship fails. A proposal to make contributors accountable. J. Am. Med. Assn., 278: 579-585, 1997.
  6. Winer, B. J.; Brown, D. R.; and Michels, K. M.: Statistical Principles in Experimental Design. Ed. 3, p. 380. New York, McGraw-Hill, 1991.
