The Journal of Bone and Joint Surgery 78:1702-6 (1996)
© 1996 The Journal of Bone and Joint Surgery, Inc.
Interobserver Reliability and Intraobserver Reproducibility of the Modified Ficat Classification System of Osteonecrosis of the Femoral Head*
STEPHEN W. SMITH, M.D.,
RALPH A. MEYER, PH.D.,
PATRICK M. CONNOR, M.D.,
STUART E. SMITH, M.D.¶ and
EDWARD N. HANLEY, JR., M.D., CHARLOTTE, NORTH CAROLINA
Investigation performed at the Department of Orthopaedic Surgery, Carolinas Medical Center, Charlotte
Abstract
Anteroposterior and lateral plain radiographs of 116 osteonecrotic femoral heads were reviewed to assess the interobserver reliability and intraobserver reproducibility of the modified Ficat classification system. The radiographs were reviewed initially and then again six months later by three adult reconstructive surgeons, two general orthopaedic surgeons, two orthopaedic residents, and one musculoskeletal radiologist.
All eight observers agreed on the classification of twenty hips (17 per cent) at both the first and the second review of the radiographs. Paired comparisons revealed a mean interobserver kappa reliability coefficient of 0.46 (range, 0.30 to 0.67) for the first review and 0.45 (range, 0.30 to 0.66) for the second. For all observers, the mean rate of perfect agreement between the first and the second review was 68 per cent (range, 56 to 80 per cent). The mean kappa value for intraobserver reproducibility was 0.59 (range, 0.44 [one of the residents] to 0.73 [one of the general orthopaedic surgeons]). No observer or pair of observers had excellent reproducibility or reliability (kappa > 0.75).
The poor interobserver reliability and fair intraobserver reproducibility undermine meaningful comparison of studies in which the modified Ficat classification system has been used and highlight the need for a more reliable and reproducible classification system.
Introduction
Osteonecrosis of the femoral head has been most commonly classified according to the system of Ficat [4,7,10,11,26] and, more recently, the modified system of Ficat [15,23,33,34] (Table I). Many authors have recommended treatment on the basis of the symptoms and the Ficat classification as determined with the use of plain radiographs [1,19,23,26,33]. Treatment with pulsed electromagnetic fields or core decompression, or both; bone-grafting and decompression; and rotational transtrochanteric osteotomy have been suggested for Ficat stage-I lesions [1,3,15,23,25,28,29,32,33]. These procedures as well as vascularized fibular grafting, rotational transtrochanteric osteotomy, and intertrochanteric osteotomy have been recommended for stages IIA and IIB [1,3,12,18,25,26,28,29,32-34]. All of these procedures as well as use of surface replacement and total hip replacement have been recommended to treat stage-III lesions [1,3,6,12,16,18,19,25,26,29,34]. Pulsed electromagnetic fields, surface replacement, and total hip replacement have been recommended for the treatment of stage-IV lesions [3,6,13,16].
The choice of treatment and the judgment of its efficacy often are based directly on the Ficat stage. Determination of the Ficat stage, therefore, has important consequences, as it has a direct effect on the patient's clinical course.
A classification system is a means of description that should be biologically meaningful and be reproducible from one observer to the next as well as by one observer on separate occasions. The absence of reproducibility clouds the comprehension and comparison of studies and treatment recommendations that are based on such a classification system. The purpose of the present study was to assess the degree of interobserver reliability and intraobserver reproducibility of the modified Ficat classification system for osteonecrosis of the femoral head.
Materials and Methods
One hundred and twenty-one symptomatic, untreated, osteonecrotic femoral heads were assessed consecutively with standard anteroposterior and lateral plain radiographs at our clinics between 1988 and 1993. Hips that did not have any radiographic changes had changes evident on magnetic resonance imaging that were consistent with osteonecrosis [20]. The acceptability of each radiograph was determined by an orthopaedic surgeon who was not an observer in this study. The radiographs of five hips were determined to be unacceptable; therefore, 116 hips were included in the study. All identifying data on the radiographs were obscured, and the radiographs were numbered randomly.
Each radiograph was reviewed and classified by eight observers: three adult orthopaedic reconstructive surgeons, two general orthopaedic surgeons, two fifth-year orthopaedic residents, and one musculoskeletal radiologist. All of the observers were familiar with the Ficat classification system [7] and had used it in clinical situations previously.
One week before the radiographic review, each observer was provided with and asked to review a copy of Ficat's description of his classification system [7] and each was given information regarding the modified Ficat system. The observers were also provided with a copy of the classification system during testing and were allowed to refer to this as often as necessary. No questions or discussion were allowed during or after testing, and the observers were given as much time as they needed to review each radiograph. After a decision had been made, the radiographs of the next hip were presented until all 116 hips had been classified.
The radiographs were classified by each observer on two separate occasions, six months apart. In the interim, the radiographs were not available to any of the observers and no feedback was provided. The second review was performed in a similar manner, except that the order of the radiographs was reversed.
Statistical Analysis
Computer-assisted statistical analysis (BMDP Statistical Software, PC Version; University of California, Berkeley, California) was used to determine interobserver and intraobserver variability. Kappa values were calculated by relating the observed proportion of agreement to the proportion of agreement expected by chance. Kappa coefficients can range from +1.0 (complete agreement) to 0.0 (chance agreement) to less than 0.0 (less agreement than expected by chance) [8].
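As a hedged illustration of the calculation described above, the following sketch computes an unweighted Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the proportion expected by chance. It is not the BMDP routine used in the study, and the text does not state whether a weighted or unweighted kappa was used; the function name `cohen_kappa` and the sample ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of agreement
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of six hips by two observers (not study data)
first = ["I", "IIA", "IIB", "III", "IV", "IV"]
second = ["I", "IIA", "III", "III", "IV", "III"]
print(round(cohen_kappa(first, second), 3))
```

The same function serves both analyses: two different observers rating the same hips (interobserver reliability) or one observer's first and second reviews (intraobserver reproducibility).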
The guidelines of Svanholm et al. [30] were used for interpretation of the kappa values. Values of less than 0.50 indicated poor agreement and those of more than 0.75 indicated excellent agreement.
Accuracy, or how close an experimental observation lies to a true value, was impossible to measure because the correct classification for each osteonecrotic hip was not known. We therefore assessed the level of agreement between paired observers (interobserver reliability) and between the reviews of the same observer (intraobserver reproducibility) over time.
Results
Interobserver Reliability
During the first and second reviews of the radiographs, all eight observers agreed on the classification of only twenty hips (17 per cent): eleven were stage IV, one was stage III, two were stage IIA, and six were stage I. Seven of the eight observers agreed on thirty-nine hips (34 per cent) during the first review and on thirty-five hips (30 per cent) during the second. When the level of agreement was lowered to a majority (five of eight observers) level, the observers agreed on eighty-five hips (73 per cent) and eighty-four hips (72 per cent) during the first and second reviews, respectively.
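The agreement levels reported above (all eight, seven of eight, and a majority of five of eight observers) can be tallied as in the sketch below. It assumes that "agreement" at a given level means at least that many observers chose the same stage for a hip within one review; the function name `hips_with_agreement` and the sample ratings are hypothetical.

```python
from collections import Counter


def hips_with_agreement(ratings_by_hip, min_observers):
    """Count hips for which at least `min_observers` assigned the same stage."""
    count = 0
    for ratings in ratings_by_hip:
        # Size of the largest group of observers sharing one stage
        largest_group = Counter(ratings).most_common(1)[0][1]
        if largest_group >= min_observers:
            count += 1
    return count


# Hypothetical data: one row per hip, one entry per observer (not study data)
ratings_by_hip = [
    ["IV"] * 8,
    ["III"] * 7 + ["IV"],
    ["IIA"] * 5 + ["IIB"] * 3,
    ["IIA"] * 4 + ["IIB"] * 4,
]
for threshold in (8, 7, 5):
    print(threshold, hips_with_agreement(ratings_by_hip, threshold))
```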
Paired comparisons among the classifications of the eight observers for each radiograph produced twenty-eight paired analyses for each of the two reviews of the radiographs. The mean interobserver reliability coefficient was 0.46 (range, 0.30 to 0.67) for the first review and 0.45 (range, 0.30 to 0.66) for the second. The mean reliability coefficient with respect to the level of expertise was 0.41 for the orthopaedic residents, 0.51 for the adult reconstructive surgeons (who had a mean of 7.7 years of experience), and 0.60 for the general orthopaedic surgeons (who had twenty and fifteen years of experience).
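The twenty-eight paired analyses follow from choosing two of eight observers (8 x 7 / 2 = 28). A minimal sketch of such a pairwise analysis is given below, assuming an unweighted Cohen's kappa for each pair; the helper names and the sample ratings are hypothetical and do not reproduce the study data.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    n = len(ratings_a)
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return 1.0 if p_e == 1.0 else (p_o - p_e) / (1 - p_e)


def pairwise_kappas(ratings_by_observer):
    """Kappa for every unordered pair of observers, keyed by the pair."""
    return {
        (x, y): cohen_kappa(ratings_by_observer[x], ratings_by_observer[y])
        for x, y in combinations(sorted(ratings_by_observer), 2)
    }


# Eight observers give 28 unordered pairs, as stated in the text.
n_pairs = len(list(combinations(range(8), 2)))

# Hypothetical ratings of four hips by three observers (not study data)
ratings_by_observer = {
    "A": ["I", "IIA", "III", "IV"],
    "B": ["I", "IIA", "III", "IV"],
    "C": ["I", "IIB", "III", "III"],
}
kappas = pairwise_kappas(ratings_by_observer)
values = list(kappas.values())
print(n_pairs, len(kappas))
print(round(mean(values), 2), round(min(values), 2), round(max(values), 2))
```

Reporting the mean together with the minimum and maximum mirrors the "mean (range)" form used in the text.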
Intraobserver Reproducibility
Assessment of the data for all eight observers revealed an average of thirty-seven instances (32 per cent) in which the classification of a hip at the first review differed from that at the second. Changes were more common when the hip had initially been assigned a middle stage: 31 per cent (nine) of the twenty-nine hips that had been classified initially as stage IIA, eight of the seventeen hips that had been classified as stage IIB (transition), and 39 per cent (nine) of the twenty-three hips that had been classified as stage III were classified differently at the second review. In comparison, two of the eleven hips that had been classified as stage I and 19 per cent (seven) of the thirty-six hips that had been classified as stage IV were classified differently at the second review.
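Taking the counts in the paragraph above at face value, the per-stage rates of reclassification can be checked with a few lines of arithmetic. The grouping into "middle" stages (IIA, IIB, III) and "end" stages (I, IV) is an assumption made for this sketch; pooled in that way, the rates come to roughly 38 per cent and 19 per cent.

```python
# Counts from the text: stage -> (hips reclassified, hips initially in that stage)
changes = {
    "I": (2, 11),
    "IIA": (9, 29),
    "IIB": (8, 17),
    "III": (9, 23),
    "IV": (7, 36),
}


def change_rate(stages):
    """Pooled proportion of hips reclassified across the given stages."""
    changed = sum(changes[s][0] for s in stages)
    total = sum(changes[s][1] for s in stages)
    return changed / total


for stage, (changed, total) in changes.items():
    print(f"stage {stage}: {changed}/{total} = {changed / total:.0%}")

middle = change_rate(["IIA", "IIB", "III"])  # 26 of 69 hips
end = change_rate(["I", "IV"])               # 9 of 47 hips
print(f"middle {middle:.0%}, end {end:.0%}, ratio {middle / end:.1f}")
```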
The mean kappa intraobserver reproducibility coefficient was 0.59 (range, 0.44 to 0.73) among the eight observers (Table II). The mean reproducibility was 0.60 (0.66, 0.58, and 0.56) for the adult reconstructive surgeons, 0.49 (0.54 and 0.44) for the orthopaedic residents, and 0.62 (0.73 and 0.55) for the general orthopaedic surgeons. The mean perfect agreement between the first and the second review was 68 per cent (range, 56 to 80 per cent).
Paired comparison of the two observers with the highest reproducibility revealed kappa reliability coefficients of 0.49 and 0.55 during the first and second reviews, respectively.
Discussion
Orthopaedic classification systems provide subdivisions in the spectrum of presentation of certain disease processes. These subdivisions, in turn, may be studied individually with respect to diagnosis, treatment, prognosis, and outcome. For such a classification system to be useful, it must be reproducible among different observers as well as by the same observer on separate occasions. Without these qualities, a precise language on which scientific study and comparison can be based is impossible.
Reported evaluations of orthopaedic classification systems [2,5,9,17,21,22,31] have yielded disappointing results, with kappa values ranging from 0.40 to 0.57 for interobserver reliability and from 0.58 to 0.69 for intraobserver reproducibility. Recently, Kay et al. [14] reported a kappa statistic of 0.82 for intraobserver variability and 0.56 for interobserver variability for twenty-five hips that were assessed with use of the Ficat classification system on three occasions by six observers (Table III).
TABLE III
INTEROBSERVER RELIABILITY AND INTRAOBSERVER REPRODUCIBILITY FOR ORTHOPAEDIC CLASSIFICATION SYSTEMS REPORTED IN THE LITERATURE
Svanholm et al. stated that "since we are dealing with the minimal requirement that the criteria under study are reproducible, and since the lower limit of the reproducibility defines the upper limit of the significance of the parameter in biological terms, we advocate that 0.50 should be taken as poor, and 0.75 as excellent reproducibility." [30] We modified this scheme so that kappa values that were between 0.50 and 0.75 were interpreted as fair. According to these modified guidelines, the present study demonstrated poor interobserver reliability and fair intraobserver reproducibility, with no instances of excellent intraobserver reproducibility or interobserver reliability. In simpler terms, during approximately one of every three reviews, an educated observer classified an osteonecrotic hip differently than a colleague did and, in fact, differently than he himself had classified the hip six months previously.
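The modified interpretation scheme can be written as a small lookup, sketched below. The text defines values below 0.50 as poor, values above 0.75 as excellent, and values between them as fair; treating exactly 0.50 and exactly 0.75 as fair is an assumption of this sketch, and the function name `interpret_kappa` is hypothetical.

```python
def interpret_kappa(kappa):
    """Label a kappa value using the modified thresholds described in the text."""
    if kappa < 0.50:
        return "poor"
    if kappa <= 0.75:  # assumption: the boundary values 0.50 and 0.75 count as fair
        return "fair"
    return "excellent"


# Mean kappa values reported in this study
reported = {
    "interobserver, first review": 0.46,
    "interobserver, second review": 0.45,
    "intraobserver": 0.59,
}
for name, value in reported.items():
    print(name, value, interpret_kappa(value))
```

Applied to the reported means, this labels both interobserver values as poor and the intraobserver value as fair, which matches the interpretation given above.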
The observers were least likely to change their classification of hips that they had initially classified as stage I or IV, presumably because these stages represent clear visual images of either a normal hip without changes or a severely osteonecrotic hip with dramatic changes. Hips that were classified initially as stage IIA, stage IIB (transition), or stage III were twice as likely to be classified differently by the same observer as those classified as stage I or IV. While we did not address the reasons for this, the discrepancy may result from various interpretations of the phrases used in Ficat's original description of the middle stages [7], such as diffuse sclerosis in the description of stage II; crescentic line, segmental flattening (or out-of-round appearance) in the description of transition; and sequestrum and collapse in the description of stage III.
The spectrum of the osteonecrotic disease process, stratified into rigid classifications, creates inevitable gray areas at the boundaries of each grade. The use of language to describe a radiograph is fraught with imprecision and is subject to the observer's interpretation of both the language and the radiograph. An out-of-round appearance to one observer may be collapse to another. These gray areas and the various interpretations of Ficat's language cause variability in the application of this classification system.
These results, especially with regard to the middle stages of classification, are important and disturbing for several reasons. In the absence of excellent interobserver reliability, it is not sound to compare studies of similarly classified hips from different centers [30]. Also, in the absence of excellent intraobserver reproducibility, it is not sound to rely on outcome studies that are based on plain radiographs that were assessed over a period of time, before and after treatment, within the same center, or by the same observer or group of observers.
Many methods of treatment for the various stages of osteonecrosis of the femoral head have been described, with variable results [1,3,15,16,18,19,23-29,32-34]. Widely divergent recommendations for treatment as well as greatly differing prognoses often are based on use of the Ficat classification as it reflects the presence or absence of subchondral, segmental, or advanced collapse of the femoral head. It is reasonable to assume that one cause of the present conflict in treatment recommendations is the different interpretation and use of the Ficat classification by different investigators.
The modified Ficat classification for osteonecrosis of the femoral head does not have acceptable interobserver reliability and intraobserver reproducibility on which to base treatment protocols and determinations of outcome. Magnetic resonance imaging and computed tomography offer a more detailed view of the involvement of the femoral head, subchondral collapse, narrowing of the joint space, and acetabular changes found with the progression of this disease. Future treatment recommendations and outcome analysis may be more reliably and reproducibly based on these imaging techniques.
NOTE: The authors give special thanks to Walter B. Beaver, M.D.; James Coumas, M.D.; Thomas K. Fehring, M.D.; Forney Hutchinson, III, M.D.; and Jeffrey G. Mokris, M.D., for participating in the study. The authors also thank Kim Gravitte for her technical assistance in the statistical analysis of the data.
Footnotes
*No benefits in any form have been received or will be received from a commercial party related directly or indirectly to the subject of this article. No funds were received in support of this study.
Peachtree Orthopaedic Clinic, 2001 Peachtree Road, N. E., Suite 705, Atlanta, Georgia 30309.
Department of Orthopaedic Surgery, Carolinas Medical Center, 1000 Blythe Boulevard, Charlotte, North Carolina 28232.
Miller Orthopaedic Clinic, 1001 Blythe Boulevard, Charlotte, North Carolina 28232.
¶Tennessee Orthopaedic Associates, 301 21st Avenue North, Nashville, Tennessee 37203.
References
1. Aaron, R. K.; Lennox, D.; Bunce, G. E.; and Ebert, T.: The conservative treatment of osteonecrosis of the femoral head. A comparison of core decompression and pulsing electromagnetic fields. Clin. Orthop., 249: 209-218, 1989.
2. Andersen, E.; Jorgensen, L. G.; and Hededam, L. T.: Evans' classification of trochanteric fractures: an assessment of the interobserver and intraobserver reliability. Injury, 21: 377-378, 1990.
3. Bassett, C. A. L.; Schink-Ascani, M.; and Lewis, S. M.: Effects of pulsed electromagnetic fields on Steinberg ratings of femoral head osteonecrosis. Clin. Orthop., 246: 172-185, 1989.
4. Camp, J. F., and Colwell, C. W., Jr.: Core decompression of the femoral head for osteonecrosis. J. Bone and Joint Surg., 68-A: 1313-1319, Dec. 1986.
5. Dias, J. J.; Taylor, M.; Thompson, J.; Brenkel, I. J.; and Gregg, P. J.: Radiographic signs of union of scaphoid fractures. An analysis of inter-observer agreement and reproducibility. J. Bone and Joint Surg., 70-B(2): 299-301, 1988.
6. Dutton, R. O.; Amstutz, H. C.; Thomas, B. J.; and Hedley, A. K.: Tharies surface replacement for osteonecrosis of the femoral head. J. Bone and Joint Surg., 64-A: 1225-1237, Oct. 1982.
7. Ficat, R. P.: Idiopathic bone necrosis of the femoral head. Early diagnosis and treatment. J. Bone and Joint Surg., 67-B(1): 3-9, 1985.
8. Fleiss, J. L.: Statistical Methods for Rates and Proportions. Ed. 2, p. 217. New York, John Wiley and Sons, 1981.
9. Frandsen, P. A.; Andersen, E.; Madsen, F.; and Skjodt, T.: Garden's classification of femoral neck fractures. An assessment of inter-observer variation. J. Bone and Joint Surg., 70-B(4): 588-590, 1988.
10. Hopson, C. N., and Siverhus, S. W.: Ischemic necrosis of the femoral head. Treatment by core decompression. J. Bone and Joint Surg., 70-A: 1048-1051, Aug. 1988.
11. Hungerford, D. S., and Lennox, D. W.: Diagnosis and treatment of ischemic necrosis of the femoral head. In Surgery of the Musculoskeletal System, edited by C. McC. Evarts. Ed. 2, vol. 3, pp. 2757-2794. New York, Churchill Livingstone, 1990.
12. Jacobs, M. A.; Hungerford, D. S.; and Krackow, K. A.: Intertrochanteric osteotomy for avascular necrosis of the femoral head. J. Bone and Joint Surg., 71-B(2): 200-204, 1989.
13. Katz, R. L.; Bourne, R. B.; Rorabeck, C. H.; and McGee, H.: Total hip arthroplasty in patients with avascular necrosis of the hip. Follow-up observations on cementless and cemented operations. Clin. Orthop., 281: 145-151, 1992.
14. Kay, R. M.; Lieberman, J. R.; Dorey, F. J.; and Seeger, L. L.: Inter- and intraobserver variation in staging patients with proven avascular necrosis of the hip. Clin. Orthop., 307: 124-129, 1994.
15. Lennox, D. W.; Murrah, R. L.; Ebert, T.; and Carbone, J.: The efficacy and safety of core decompression of the hip as a treatment for osteonecrosis. Complicat. Orthop., 47: 39-42, 47, 1993.
16. Meyers, M. H.: Osteonecrosis of the femoral head. Pathogenesis and long-term results of treatment. Clin. Orthop., 231: 51-61, 1988.
17. Nielsen, J. O.; Dons-Jensen, H.; and Sorensen, H. T.: Lauge-Hansen classification of malleolar fractures. An assessment of the reproducibility in 118 cases. Acta Orthop. Scandinavica, 61: 385-387, 1990.
18. Saito, S.; Ohzono, K.; and Ono, K.: Joint-preserving operations for idiopathic avascular necrosis of the femoral head. J. Bone and Joint Surg., 70-B(1): 78-84, 1988.
19. Scher, M. A., and Jakim, I.: Intertrochanteric osteotomy and autogenous bone-grafting for avascular necrosis of the femoral head. J. Bone and Joint Surg., 75-A: 1119-1133, Aug. 1993.
20. Seiler, J. G., III; Christie, M. J.; and Homra, L.: Correlation of the findings of magnetic resonance imaging with those of bone biopsy in patients who have stage-I or II ischemic necrosis of the femoral head. J. Bone and Joint Surg., 71-A: 28-32, Jan. 1989.
21. Sidor, M. L.; Zuckerman, J. D.; Lyon, T.; Koval, K.; Cuomo, F.; and Schoenberg, N.: The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J. Bone and Joint Surg., 75-A: 1745-1750, Dec. 1993.
22. Siebenrock, K. A., and Gerber, C.: The reproducibility of classification of fractures of the proximal end of the humerus. J. Bone and Joint Surg., 75-A: 1751-1755, Dec. 1993.
23. Smith, S. W.; Fehring, T. K.; Griffin, W. L.; and Beaver, W. B.: Core decompression of the osteonecrotic femoral head. J. Bone and Joint Surg., 77-A: 674-680, May 1995.
24. Steinberg, M. E.: Management of avascular necrosis of the femoral head - an overview. In Instructional Course Lectures, The American Academy of Orthopaedic Surgeons. Vol. 37, pp. 41-50. St. Louis, C. V. Mosby, 1988.
25. Steinberg, M. E.; Brighton, C. T.; Corces, A.; Hayken, G. D.; Steinberg, D. R.; Strafford, B.; Tooze, S. E.; and Fallon, M.: Osteonecrosis of the femoral head. Results of core decompression and grafting with and without electrical stimulation. Clin. Orthop., 249: 199-208, 1989.
26. Stulberg, B. N.; Davis, A. W.; Bauer, T. W.; Levine, M.; and Easley, K.: Osteonecrosis of the femoral head. A prospective randomized treatment protocol. Clin. Orthop., 268: 140-151, 1991.
27. Stulberg, B. N.; Levine, M.; Bauer, T. W.; Belhobek, G. H.; Pflanze, W.; Feiglin, D. H. I.; and Roth, A. I.: Multimodality approach to osteonecrosis of the femoral head. Clin. Orthop., 240: 181-193, 1989.
28. Sugano, N.; Takaoka, K.; Ohzono, K.; Matsui, M.; Saito, M.; and Saito, S.: Rotational osteotomy for non-traumatic avascular necrosis of the femoral head. J. Bone and Joint Surg., 74-B(5): 734-739, Sept. 1992.
29. Sugioka, Y.; Hotokebuchi, T.; and Tsutsui, H.: Transtrochanteric anterior rotational osteotomy for idiopathic and steroid-induced necrosis of the femoral head. Indications and long-term results. Clin. Orthop., 277: 111-120, 1992.
30. Svanholm, H.; Starklint, H.; Gundersen, H. J. G.; Fabricius, J.; Barlebo, H.; and Olsen, S.: Reproducibility of histomorphologic diagnoses with special reference to the kappa statistic. APMIS, 97: 689-698, 1989.
31. Thomsen, N. O. B.; Overgaard, S.; Olsen, L. H.; Hansen, H.; and Nielsen, S. T.: Observer variation in the radiographic classification of ankle fractures. J. Bone and Joint Surg., 73-B(4): 676-678, 1991.
32. Tooke, S. M. T.; Nugent, P. J.; Bassett, L. W.; Nottingham, P.; Mirra, J.; and Jinnah, R.: Results of core decompression for femoral head osteonecrosis. Clin. Orthop., 228: 99-104, 1988.
33. Warner, J. J. P.; Philip, J. H.; Brodsky, G. L.; and Thornhill, T. S.: Studies of nontraumatic osteonecrosis. The role of core decompression in the treatment of nontraumatic osteonecrosis of the femoral head. Clin. Orthop., 225: 104-127, 1987.
34. Yoo, M. C.; Chung, D. W.; and Hahn, C. S.: Free vascularized fibula grafting for the treatment of osteonecrosis of the femoral head. Clin. Orthop., 277: 128-138, 1992.
