The Journal of Bone and Joint Surgery 79:656-63 (1997)
© 1997 The Journal of Bone and Joint Surgery, Inc.

Severin Classification System for Evaluation of the Results of Operative Treatment of Congenital Dislocation of the Hip. A Study of Intraobserver and Interobserver Reliability*

W. TIMOTHY WARD, M.D.†, MOLLY VOGT, PH.D.†, JAN S. GRUDZIAK, M.D.†, PITTSBURGH, YÜCEL TÜMER, M.D.‡, ANKARA, TURKEY, P. CHRISTOPHER COOK, M.D., F.R.C.S.(C)†, PITTSBURGH, PENNSYLVANIA, and ROBERT D. FITCH, M.D.§, DURHAM, NORTH CAROLINA

Investigation performed at the University of Pittsburgh, Pittsburgh, Bayindir Medical Center, Ankara, and Duke University Medical Center, Durham


    Abstract
The Severin classification system frequently is used to evaluate the radiographic results of operations performed for the treatment of congenital dislocation of the hip. However, the reliability of this classification scheme has not been established, to our knowledge. Ideally, a classification system should be validated before it is used to promote therapeutic guidelines or to compare results of treatment; the purpose of the present study was to establish the intraobserver and interobserver reliability of the Severin classification system. Four blinded raters and the operating surgeon independently used the Severin system to evaluate the most recent radiographs of thirty-seven children (fifty-six hips) who had been managed, an average of nine years previously, with a medial open reduction for congenital dislocation of the hip. Three of the raters evaluated the same radiographs again under similar testing circumstances eight weeks later. Ten paired interobserver and three intraobserver comparisons then were analyzed with use of the Cohen kappa coefficient (κ). The average kappa coefficient for the six pairwise comparisons between the four blinded raters was 0.15 (range, -0.05 to 0.42) when all Severin classes were analyzed independently. The average kappa coefficient for the four pairwise comparisons between the blinded raters and the operating surgeon was even lower (0.02). The kappa coefficients for the three intraobserver comparisons were 0.20, 0.38, and 0.44 (average, 0.34). Kappa analysis demonstrated variable and low levels of agreement when the Severin system was used to rate the results of operations performed for the treatment of congenital dislocation of the hip. We believe that the unadjusted kappa coefficient should indicate excellent agreement (κ > 0.75) for all comparisons if this system is to be used for the evaluation of clinical results.
The unacceptably low levels of intraobserver and interobserver reliability call into question the clinical conclusions of reports in which the Severin system has been used as the basis of proof.


    Introduction
The Severin classification system25 frequently is used to assess the radiographic results of operations performed for the treatment of congenital dislocation of the hip. Despite its widespread acceptance, the reliability of this system has not been established, to our knowledge.

Severin first used this classification system in 1941 to describe the radiographic appearance of the hip after the closed treatment of congenital dislocation25. The system includes six main categories: class I (normal), class II (moderate deformity), class III (dysplasia with no subluxation), class IV (subluxation), class V (subluxation with a pseudoacetabulum), and class VI (redislocation) (Table I). Severin used no quantitative parameters other than the center-edge angle to determine the radiographic classification. Clinicians who use the Severin system appear to have reached a consensus that class I indicates an excellent result; class II, a good result; class III, a fair result; and classes IV, V, and VI, a poor result2-4,13,15. However, no measure of the accuracy or reliability of this system was included in the original investigations by Severin25,26 or has been reported since then, to our knowledge.


TABLE I SEVERIN CLASSIFICATION SYSTEM25*

 
Salter, in 1961, used the Severin system to assess both the maintenance of complete reduction and the osseous development of the acetabulum and the femoral head after innominate osteotomy22. In that study, twenty-three of twenty-five hips that had been rated as Severin class IV or V preoperatively were rated as class I or II postoperatively. In a later report on 325 subluxated or dislocated (class-IV or V) hips, Salter and Dubos noted a dramatic improvement in the Severin rating after innominate osteotomy with or without open reduction23. Those two studies, both of which involved the use of the Severin system, were instrumental in the widespread adoption of innominate osteotomy for the treatment of congenital subluxation or dislocation of the hip in children who are eighteen months to six years old.

Since the 1960's, the Severin system also has been used for the evaluation of the results of a number of alternative procedures for the operative treatment of congenital hip disease10,15,24,34,35 as well as for the comparison of results across studies2. Many of the surgeons involved in those studies reported favorably on the utility of the Severin system.

The value of any clinical classification system is only as good as its reliability or validity when it is used by expert clinicians. In order to validate such a system, it is necessary to demonstrate that the classifications are accurate and reproducible. However, as there is no universally accepted standard with which the Severin system can be compared, the accuracy of the ratings cannot be determined. Therefore, the consistency of the ratings among users is of the utmost importance. For example, the reproducibility of the interpretations made by the same observer on separate occasions (intraobserver reliability) or by multiple observers on a single occasion (interobserver reliability) is critical to the intrinsic value of the system. Ideally, the reliability of a classification system should be established before the system is used to promote therapeutic guidelines or to compare the results of alternative treatments. When such a system is shown to have unacceptable reliability, any conclusions drawn from its application may not be valid.

The present study was designed to test both the interobserver and the intraobserver reliability of the Severin system as used by pediatric orthopaedic surgeons.


    Materials and Methods
Thirty-seven children with an average age of eleven months (range, two to twenty-five months) were managed with open reduction through a medial approach for the treatment of congenital dislocation of the hip. Nineteen patients had a bilateral procedure and eighteen, a unilateral procedure; thus, a total of fifty-six hips were treated. The most recent follow-up radiographs (anteroposterior radiographs of the pelvis and frog-leg lateral radiographs of the involved hip or hips), which had been made an average of nine years (range, three to seventeen years) after the initial operation, were used in the present study. All of the operations and follow-up examinations were performed by one surgeon (Y. T.). None of the other surgeons who assessed the radiographs had participated in any aspect of the care of the thirty-seven patients.

The radiographs were labeled sequentially, and all data regarding the identity of the patient and the side of the operation were blocked out. The operating surgeon and four other surgeons each independently rated the fifty-six hips with use of the Severin system; only the operating surgeon knew which hip or hips had been operatively treated. The four blinded raters were fellowship-trained pediatric orthopaedic surgeons; raters 1 and 4 had more than ten years of post-fellowship experience, rater 2 had just completed his fellowship, and rater 3 had more than five years of post-fellowship experience. The operating surgeon (rater 5) was not fellowship-trained but had more than twenty years of experience as a pediatric orthopaedic surgeon and is considered a regional expert on problems related to the hip in children. At the time of writing, all five raters were practicing pediatric orthopaedic surgeons who routinely treated congenital dislocation of the hip.

Seven hips were treated with a femoral or pelvic osteotomy after the initial medial open reduction; evidence of these procedures often could be seen on the radiographs and was potentially obvious to the four blinded raters. However, the blinded raters were not specifically told which seven hips had been treated with another operation.

The operating surgeon (rater 5) rated each of the fifty-six hips, under routine clinical conditions, before the initiation of the study. The other four raters were given a detailed description of the Severin classification scheme, as originally published, and then were asked to rate each hip. (The operating surgeon was not provided with the detailed description.) Because the four blinded raters did not know the ages of the patients, the subdivision of Severin classes according to age-adjusted measurements of the center-edge angle was not requested, and the interpretations were categorized simply as Severin class I, II, III, IV, V, or VI (Table I). Each blinded rater interpreted the radiographs independently and did not know how they had been interpreted by the other raters. The raters were allowed unrestricted time to rate each hip, and a goniometer was available for their use. Six pairwise comparisons were made between the four blinded raters, and four pairwise comparisons were made between the blinded raters and the operating surgeon.
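
The count of ten interobserver comparisons follows directly from the design: four blinded raters yield six unordered pairs, and each blinded rater is paired once with the operating surgeon. The short sketch below is an illustration only; the variable names are hypothetical and are not part of the original analysis.

```python
from itertools import combinations

blinded_raters = [1, 2, 3, 4]   # fellowship-trained, blinded raters
operating_surgeon = 5           # rater 5

# Every unordered pair of blinded raters
blinded_pairs = list(combinations(blinded_raters, 2))
# Each blinded rater compared once with the operating surgeon
surgeon_pairs = [(rater, operating_surgeon) for rater in blinded_raters]

all_pairs = blinded_pairs + surgeon_pairs
print(len(blinded_pairs), len(surgeon_pairs), len(all_pairs))  # 6 4 10
```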

Eight weeks later, raters 1, 2, and 3 evaluated the same set of radiographs again under similar testing circumstances. At the time of the first evaluation, the raters had been told to keep no notes and to make no attempt to memorize the radiographs. At the time of the second evaluation, the radiographs were presented in a different numerical order to guard against any recall bias from the first interpretation.

Statistical Analysis
The Cohen kappa coefficient7 and the weighted kappa statistic8 were calculated to assess the interobserver and intraobserver reliability with use of the Stata software package (version 4.0; Stata, College Station, Texas). Kappa is a measure of the pairwise agreement between observers that reflects the proportion of observed agreement beyond that expected by chance alone6-8,29,36. (Agreement due to chance alone is indicated by a kappa value of zero.) For each of the ten interobserver and three intraobserver comparisons, calculations were made in order to determine the proportion of observed agreement (Po), the proportion of expected agreement (Pe), and the kappa value.
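
The relationship between these three quantities is kappa = (Po - Pe) / (1 - Pe). The following is a minimal sketch of that calculation for one pair of raters, written in Python for illustration; the original analysis was performed with Stata, and the function name and the ratings shown are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Return (Po, Pe, kappa) for two equal-length lists of class labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Po: proportion of hips assigned the same class by both raters
    po = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Pe: agreement expected by chance, from each rater's class frequencies
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    pe = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if pe == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return po, pe, (po - pe) / (1 - pe)


# Hypothetical Severin classes (1 to 4) for ten hips; NOT data from the study
rater_1 = [1, 2, 2, 3, 3, 2, 4, 3, 2, 1]
rater_2 = [1, 2, 3, 3, 2, 2, 4, 3, 3, 2]
po, pe, kappa = cohen_kappa(rater_1, rater_2)
print(f"Po={po:.2f} Pe={pe:.2f} kappa={kappa:.2f}")  # Po=0.60 Pe=0.31 kappa=0.42
```

In this made-up example the raters agree on six of ten hips, yet about half of that agreement would be expected by chance, so kappa is well below Po.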

Whereas the kappa coefficient reflects only complete agreement between raters, the weighted kappa statistic allows partial agreement to be taken into account. For example, when two surgeons rate a single radiograph as class II and III (or class II and I), the result is considered as non-agreement for the calculation of the kappa coefficient and as partial or close agreement for the calculation of the weighted kappa statistic.
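
The distinction can be sketched in code. The weighting scheme used in the study is not stated in the text, so the example below assumes linear agreement weights (full credit for identical classes, partial credit that falls off with the distance between classes); the function name and ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, classes):
    """Weighted kappa with (assumed) linear weights over ordered classes."""
    k, n = len(classes), len(ratings_a)
    if n == 0 or n != len(ratings_b) or k < 2:
        raise ValueError("need equal-length ratings and at least two classes")
    index = {c: i for i, c in enumerate(classes)}
    # Joint proportions: observed[i][j] = share of hips rated i by A and j by B
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    po_w = pe_w = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1 - abs(i - j) / (k - 1)   # 1 on the diagonal, 0 at the extremes
            po_w += weight * observed[i][j]
            pe_w += weight * row[i] * col[j]
    if pe_w == 1.0:
        raise ValueError("weighted kappa is undefined when expected agreement is 1")
    return (po_w - pe_w) / (1 - pe_w)


classes = [1, 2, 3, 4]                 # Severin classes I to IV
reference = [1, 2, 3, 4]               # hypothetical ratings, not study data
near_miss = [2, 2, 3, 4]               # first hip off by one class
far_miss = [4, 2, 3, 4]                # first hip off by three classes
print(weighted_kappa(reference, near_miss, classes) >
      weighted_kappa(reference, far_miss, classes))  # True
```

Both alternative raters disagree with the reference on exactly one hip, so the unweighted proportion of agreement is identical; only the weighted statistic rewards the near miss.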

In addition, a composite multi-rater kappa score11 was calculated for each Severin class. This score provided a weighted measure of all of the paired kappa scores for each Severin class and reflected the over-all agreement among the raters.
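
The text does not reproduce the formula for the composite score. One common formulation, given here as an assumption rather than as the exact computation used in the study, is the per-category multi-rater kappa described by Fleiss: for a class j rated by m raters on n hips, kappa_j = 1 - sum_i x_ij (m - x_ij) / [n m (m - 1) p_j (1 - p_j)], where x_ij is the number of raters who assigned hip i to class j and p_j is the overall proportion of ratings in class j. The function name and data below are hypothetical.

```python
def multirater_kappa_per_class(ratings_by_hip, classes):
    """Per-class multi-rater kappa; each inner list holds one hip's ratings."""
    n = len(ratings_by_hip)
    if n == 0:
        raise ValueError("need at least one rated hip")
    m = len(ratings_by_hip[0])
    if m < 2 or any(len(r) != m for r in ratings_by_hip):
        raise ValueError("every hip needs the same number (>= 2) of ratings")
    result = {}
    for c in classes:
        # x[i]: how many of the m raters placed hip i in class c
        x = [ratings.count(c) for ratings in ratings_by_hip]
        p = sum(x) / (n * m)
        if p == 0.0 or p == 1.0:
            result[c] = float("nan")   # undefined when a class is never or always used
            continue
        disagreement = sum(xi * (m - xi) for xi in x)
        result[c] = 1 - disagreement / (n * m * (m - 1) * p * (1 - p))
    return result


# Hypothetical ratings of four hips by three raters; NOT data from the study
ratings = [[1, 1, 2], [2, 2, 2], [3, 2, 3], [1, 1, 1]]
print(multirater_kappa_per_class(ratings, classes=[1, 2, 3]))
```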

The four Severin classes identified in the present study then were combined into two sets of dichotomous groups, and kappa values again were calculated for each of the ten interobserver and the three intraobserver comparisons. In addition, a composite multi-rater kappa score was calculated to provide an over-all assessment of agreement among all of the raters. One set of dichotomous groups consisted of Severin class I and the combination of classes II, III, and IV, and the other set consisted of the combination of classes I and II and the combination of classes III and IV. The analysis of the kappa values according to these dichotomous groups was an attempt to highlight any particular difficulty that the raters may have had in identifying a specific Severin class.

The level of agreement was assessed with use of the system described by Svanholm et al.31, in which a kappa value of 0.50 or less indicates poor agreement, a value of 0.51 to 0.75 indicates moderate agreement, and a value of 0.76 or more indicates excellent agreement. This method is more stringent than the more commonly used system defined by Landis and Koch18, in which a kappa value of 0.21 to 0.40 indicates fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and 0.81 to 1.0, excellent agreement. The 95 per cent confidence intervals were calculated as kappa ± 1.96 times the standard error.
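
The two grading schemes and the confidence interval can be written out as follows. This is a sketch for illustration: the function names are hypothetical, the Landis and Koch labels below 0.21 are not listed in the text and are assumed here, and the standard error in the example is invented.

```python
def svanholm_label(kappa):
    """Grade agreement with the thresholds attributed to Svanholm et al."""
    if kappa <= 0.50:
        return "poor"
    if kappa <= 0.75:
        return "moderate"
    return "excellent"


def landis_koch_label(kappa):
    """Grade agreement with the Landis and Koch bands as quoted in the text."""
    if kappa < 0.0:
        return "poor"         # assumption: band not listed in the text
    if kappa <= 0.20:
        return "slight"       # assumption: band not listed in the text
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


def confidence_interval_95(kappa, standard_error):
    """Return (lower, upper) as kappa +/- 1.96 times the standard error."""
    half_width = 1.96 * standard_error
    return kappa - half_width, kappa + half_width


# 0.42 was the highest pairwise kappa reported between the blinded raters
print(svanholm_label(0.42), landis_koch_label(0.42))  # poor moderate
# Hypothetical standard error, for illustration only
low, high = confidence_interval_95(0.20, 0.05)
print(low > 0)  # True
```

The same kappa value is graded more harshly by the first scheme than by the second, which is the sense in which it is the more stringent of the two. An interval whose lower bound is above zero indicates agreement beyond that expected by chance.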


    Results

Interobserver Agreement

Individual Classes
No hip was rated as Severin class V or VI. The distributions among the four remaining Severin classes were very similar for raters 1 and 2, both of whom rated a much higher percentage of hips as classes II and III than as classes I and IV (Table II). In contrast, rater 3 demonstrated a distinct bias toward rating hips as class II at the time of the first evaluation and as class III at the time of the second evaluation. The total number of hips in each class was almost identical for raters 4 and 5, both of whom rated most hips as class I. However, summating the data in this manner does not reflect the true interobserver or intraobserver agreement, as the hips assigned to a given class by one rater may not be the same hips assigned to that class by another rater. Calculation of the kappa coefficient for pairwise agreement, however, does permit direct comparisons of the classifications assigned by the different raters.
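
A small constructed example shows why matching class totals do not imply agreement. The ratings below are hypothetical and were chosen only so that both raters assign eight of ten hips to class I; they are not data from the study.

```python
from collections import Counter

# Hypothetical ratings of ten hips; both raters use class I eight times
rater_x = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
rater_y = [2, 2, 1, 1, 1, 1, 1, 1, 1, 1]

totals_match = Counter(rater_x) == Counter(rater_y)

n = len(rater_x)
# Observed agreement: hips given the same class by both raters
po = sum(a == b for a, b in zip(rater_x, rater_y)) / n
# Chance agreement from the (identical) class totals
count_x, count_y = Counter(rater_x), Counter(rater_y)
pe = sum(count_x[c] * count_y[c] for c in count_x) / (n * n)
kappa = (po - pe) / (1 - pe)

print(totals_match, round(po, 2), round(pe, 2), round(kappa, 2))  # True 0.6 0.68 -0.25
```

The summated totals are identical, yet the raters disagree on four hips and the pairwise kappa falls below zero, because the class II ratings were assigned to different hips.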


TABLE II SUMMATED RATINGS FOR EACH RATER*

 
The kappa coefficients for the six pairwise comparisons between the classifications assigned by the four blinded raters at the time of the first interpretation ranged from -0.05 to 0.42, with the average kappa value (0.15) only slightly greater than that due to chance alone (Table III). The pairwise comparisons between raters 1, 2, and 3 demonstrated better agreement than all of the comparisons between any one of these three raters and rater 4. However, none of the comparisons demonstrated a kappa coefficient that indicated even moderate agreement (κ > 0.50). All of the pairwise comparisons between the four blinded raters and the operating surgeon demonstrated poor agreement, as indicated by an average kappa coefficient of 0.02. Even the comparison between rater 4 and the operating surgeon, the two raters whose class totals were almost identical, did not demonstrate substantial agreement (κ = 0.13) (Table III). The weighted kappa statistics were either nearly equivalent to or slightly greater than the unadjusted kappa coefficients.


TABLE III ANALYSIS OF INTEROBSERVER AGREEMENT ACCORDING TO INDIVIDUAL SEVERIN CLASSES25

 
The composite multi-rater kappa scores, which indicate the degree to which all raters agreed that a specific hip belonged to a specific Severin class, indicated poor agreement when all five raters were included in the analysis as well as when rater 5 was excluded (Table IV). The scores were modestly greater within the two classes that indicate a more severe abnormality (classes III and IV).


TABLE IV COMPOSITE MULTI-RATER KAPPA SCORES ACCORDING TO INDIVIDUAL SEVERIN CLASSES

 

Dichotomous Groups
Class I and classes II, III, and IV: For the next set of analyses, the four Severin classes were divided into two dichotomous groups: the first group included class I (normal hips), and the second group included classes II, III, and IV (abnormal hips). Only the comparison between raters 1 and 2 and the comparison between raters 4 and 5 demonstrated a level of agreement beyond that expected by chance alone (that is, the 95 per cent confidence interval did not include zero) (Table V). Even these comparisons, however, demonstrated only poor or moderate agreement. Furthermore, the multi-rater kappa scores for the four blinded raters as well as for all five raters actually indicated that the level of agreement was less than that expected by chance alone (κ = -0.02 for both sets of raters).


TABLE V ANALYSIS OF INTEROBSERVER AGREEMENT ACCORDING TO DICHOTOMOUS GROUPS

 
Classes I and II and classes III and IV: For the final assessment of interobserver agreement, the four classes again were combined into two dichotomous groups: the first group included classes I and II (normal and mildly deformed hips), and the second group included classes III and IV (dysplastic and subluxated hips). The results were similar to those of the previous analysis, except for the comparisons between raters 2 and 3 (κ increased from 0.11 to 0.58), 2 and 4 (κ increased from -0.01 to 0.16), and 4 and 5 (κ decreased from 0.21 to -0.03) (Table V). The composite multi-rater kappa scores indicated poor over-all agreement (κ = 0.20 and 0.29).

Intraobserver Agreement
Intraobserver agreement was assessed by comparing the results of the evaluations performed by raters 1, 2, and 3. When the data were analyzed according to the individual Severin classes, the kappa coefficients indicated poor agreement and the weighted kappa statistics indicated poor or moderate agreement (Table VI). All of the kappa values were significantly greater than zero (p < 0.05).


TABLE VI ANALYSIS OF INTRAOBSERVER AGREEMENT ACCORDING TO INDIVIDUAL SEVERIN CLASS25

 
Analysis of the data according to the two sets of dichotomous groups yielded results similar to those for the individual classes (Table VII). The average intraobserver agreement was poor for both sets of dichotomous groups (κ = 0.43 and 0.47). The findings regarding intraobserver agreement mirrored those regarding interobserver agreement.


TABLE VII ANALYSIS OF INTRAOBSERVER AGREEMENT ACCORDING TO DICHOTOMOUS GROUPS

 


    Discussion
The results of the present study suggest that the Severin classification system is not a reliable tool for the evaluation of the radiographic results of operations performed for the treatment of congenital dislocation of the hip. There was wide interobserver variation between the four fellowship-trained pediatric orthopaedic surgeons and the operating surgeon (Table III and Fig. 1). In addition, the three surgeons who performed two evaluations eight weeks apart demonstrated poor or moderate ability to replicate their own radiographic ratings (Table VI). Although intraobserver agreement was somewhat better than interobserver agreement for raters 1, 2, and 3 across the various Severin classes, neither was excellent according to kappa analysis. It was not possible to assess the accuracy of the Severin system because there is no universally accepted standard available for comparative purposes.



Fig. 1 Anteroposterior radiograph, made five years after a bilateral open reduction through a medial approach. The hip on the left side of the radiograph was rated as class II by rater 1, class IV by rater 2, class III by rater 3, and class I by raters 4 and 5. The hip on the right side of the radiograph was rated as class III by rater 1, class IV by rater 2, class II by rater 3, and class I by raters 4 and 5.

 
The wide variation in the ratings assigned by the five raters (Table II) reflected the very strong bias of each rater. Such bias is most likely due to either the inability of the raters to use the classification system correctly or the ambiguities within the system that prevent discrimination between varying postoperative appearances of the hip. The presence of substantial bias was confirmed by low unadjusted kappa coefficients and low composite multi-rater kappa scores. Although the inability to distinguish between Severin classes I and II or between classes III and IV is probably not clinically important, the inability to distinguish between classes II and III is. Whereas most authors have considered a class-II rating to indicate a good radiographic result, a class-III rating has, at best, been considered to indicate a fair radiographic result2-4,13,15.

A number of previous investigators who have used kappa statistics to evaluate the reliability and reproducibility of other orthopaedic classification systems have reported findings that have contradicted commonly accepted orthopaedic teaching. Siebenrock and Gerber28, Sidor et al.27, and Brien et al.5 independently reported that the Neer system for the classification of fractures of the proximal part of the humerus was not reproducible enough to allow for a meaningful comparison of the results of studies in which that system was used. Sidor et al., for example, reported that the mean interobserver reliability of this system was only moderate (κ = 0.41 to 0.60) according to the system of Landis and Koch18. Frandsen et al.12 found poor interobserver agreement when the Garden system was used to grade fractures of the femoral neck, and Andersen et al.1 found poor interobserver agreement when the Evans system was used to classify intertrochanteric fractures of the hip. Nielsen et al.20 as well as Thomsen et al.32 determined that there was less-than-acceptable agreement (κ < 0.51) when the Lauge-Hansen system was used to classify fractures of the ankle.

One of the more interesting findings of the present study is the generally poor agreement between the operating surgeon and the other four raters. The operating surgeon (rater 5) classified forty-six of the fifty-six hips as normal (class I); although one blinded rater (rater 4) classified forty-four hips as normal, the other three raters classified only one to sixteen hips as normal (Table II). There are several possible explanations for these findings. First, because the four blinded raters knew that their interpretations were to be analyzed, they may have been more critical in their assessment than was the operating surgeon, who determined the ratings in the clinical setting. Such a situation may be expected to result in an increased level of agreement between blinded raters. Second, unintentional bias may have influenced the operating surgeon when he evaluated the results of operations that he had performed; blinded investigators generally are less susceptible to this type of methodological error. Therefore, whenever possible, radiographic results should be assessed by experts who are unaware of the operative treatment and other clinical information related to the patient. However, even when the ratings of the operating surgeon were excluded, the level of interobserver agreement between the blinded raters was unacceptably low.

We attempted to control for a number of uncertainties that may have confounded the results. It has been suggested that the expertise of the raters can affect interobserver agreement17,21. For this reason, only surgeons who had similar training and clinical experience were asked to participate in this study. In addition, care was taken to ensure that the blinded raters did not have access to any data regarding the identity of the patient. A goniometer and the guidelines for the Severin system were available to all of the blinded raters. In addition, all of the blinded raters were given similar instructions, and only the operating surgeon may have been influenced by unintentional bias.

Despite efforts to control for these potential areas of bias, agreement was less than acceptable when comparable experts attempted to apply the Severin system. This finding suggests a fundamental lack of clarity in the descriptive criteria and perhaps points to the need for a modified or newly designed system. The five raters in the present study obviously had different interpretations of what constitutes moderate deformity, dysplasia, and subluxation, and it is not clear what Severin himself meant by these terms. In short, the definitions of the terms in the Severin system are not specific enough to allow raters to achieve a substantial level of agreement.

The single quantitative aspect of the Severin system is the measurement of the center-edge angle. The raters who participated in the present study expressed many concerns about the reliability of this measurement. In some cases, a rater estimated the center-edge angle to be normal but deemed the hip to be dysplastic. The reverse situation—an abnormal angle without obvious dysplasia—also was encountered. A limitation of our study lies in the various ways in which raters measured the center-edge angle. Although a goniometer was available, it was not always used in a standardized manner. Furthermore, raters frequently believed that the measurement of the center-edge angle did not completely reflect the condition of the hip; consequently, the Severin class was determined on the basis of subjective impressions as well as the measurement of the center-edge angle. However, this is how the Severin system usually is used in the clinical setting. We agree with Stulberg and Harris30, who reported that the center-edge angle may not be a useful index of acetabular development because it is affected by many factors.

There is no uniform agreement on the definitions of dysplasia and subluxation. Coleman9 as well as Weinstein33 defined dysplasia as inadequate development of the acetabulum or the femoral head, or both, with an intact Shenton line, and defined subluxation as dysplasia with a broken Shenton line. Many variations of these definitions have been offered14,16,19. Although it is not our intent to present a new, validated rating system to grade the results of operations performed for the treatment of congenital dislocation of the hip, we believe that a more quantitative approach to grading is needed. The Severin classification scheme allows for too much subjectivity, which results in an assessment that is not sufficiently reproducible. Stulberg and Harris30 as well as Murphy et al.19 attempted to be more exact in describing the late appearance of congenital dysplasia of the hip. They discussed the use of several numerical indices, including the center-edge angle, the femoral-head extrusion index, the acetabular index of the weight-bearing zone, the measured amount of lateral and superior subluxation, the peak-to-edge distance, and the angle of the acetabular roof, among others. However, the intraobserver and interobserver agreement for each of these numerical indices must be determined before they are used to grade results of operative treatment of congenital dislocation of the hip.

The present study indicates that the levels of intraobserver and interobserver reliability are unacceptably low when the Severin system is used to classify the radiographic results of operations performed for the treatment of congenital dislocation of the hip. Currently, both the decision to treat congenital hip disease and the evaluation of the postoperative results frequently are based on the Severin classification system.

The findings of the present study will enable clinicians to place into more meaningful perspective the clinical conclusions of investigators who have used the Severin system as the basis of proof. Additional research is needed to construct a more precise system for the classification of the results of operations performed for the treatment of congenital dislocation of the hip.


    Footnotes
 

*No benefits in any form have been received or will be received from a commercial party related directly or indirectly to the subject of this article. No funds were received in support of this study.

†Children's Hospital of Pittsburgh, 3705 Fifth Avenue at DeSoto Street, Pittsburgh, Pennsylvania 15213.

‡Bayindir Medical Center, Selanik Cad. 35/3, Kisilay 06650, Ankara, Turkey.

§Duke University Medical Center, Box 2911, Durham, North Carolina 27710.


    References

  1. Andersen, E.; Jorgensen, L.G.; and Hededam, L. T.: Evans' classification of trochanteric fractures: an assessment of the interobserver and intraobserver reliability. Injury, 21: 377-378, 1990.[Medline]

  2. Barrett, W. P.; Staheli, L. T.; and Chew, D. E.: The effectiveness of the Salter innominate osteotomy in the treatment of congenital dislocation of the hip. J. Bone and Joint Surg., 68-A: 79-87, Jan. 1986.[Abstract/Free Full Text]

  3. Berkeley, M. E.; Dickson, J. H.; Cain, T. E.; and Donovan, M. M.: Surgical therapy for congenital dislocation of the hip in patients who are twelve to thirty-six months old. J. Bone and Joint Surg., 66-A: 412-420, March 1984.[Abstract/Free Full Text]

  4. Blockey, N. J.: Derotation osteotomy in the management of congenital dislocation of the hip. J. Bone and Joint Surg., 66-B(4): 485-490, 1984.

  5. Brien, H.; Noftall, F.; MacMaster, S.; Cummings, T.; Landells, C.; and Rockwood, P.: Neer's classification system: a critical appraisal. J. Trauma, 38: 257-260, 1995.[Medline]

  6. Byrt, T.; Bishop, J.; and Carlin, J. B.: Bias, prevalence and kappa. J. Clin. Epidemiol, 46: 423-429, 1993.[Medline]

  7. Cohen, J. A.: A coefficient of agreement for nominal scales. Educat. and Psychol. Measure, 20: 37-46, 1960.

  8. Cohen, J.: Weighted kappa. Nominal scale agreement with provision for scaled disagreement or partial credit. Psychol. Bull., 70: 213-220, 1968.[Medline]

  9. Coleman, S. S.: Congenital Dysplasia and Dislocation of the Hip. St. Louis, C. V. Mosby, 1978.

  10. Faciszewski, T.; Kiefer, G. N.; and Coleman, S. S.: Pemberton osteotomy for residual acetabular dysplasia in children who have congenital dislocation of the hip. J. Bone and Joint Surg., 75-A: 643-649, May 1993.[Abstract/Free Full Text]

  11. Fleiss, J. L. Statistical Methods for Rates and Proportions. Ed. 2, p. 217. New York, Wiley, 1981.

  12. Frandsen, P. A.; Andersen, E.; Madsen, F.; and Skjodt, T.: Garden's classification of femoral neck fractures. An assessment of inter-observer variation. J. Bone and Joint Surg., 70-B(4): 588-590, 1988.

  13. Galpin, R. D.; Roach, J. W.; Wenger, D. R.; Herring, J. A.; and Birch, J. G.: One-stage treatment of congenital dislocation of the hip in older children, including femoral shortening. J. Bone and Joint Surg., 71-A: 734-741, June 1989.[Abstract/Free Full Text]

  14. Ganz, R.; Klaue, K.; Vinh, T. S.; and Mast, J. W.: A new periacetabular osteotomy for the treatment of hip dysplasia. Technique and preliminary results. Clin. Orthop., 232: 26-36, 1988.

  15. Kasser, J. R.; Bowen, J. R.; and MacEwen, G. D.: Varus derotation osteotomy in the treatment of persistent dysplasia in congenital dislocation of the hip. J. Bone and Joint Surg., 67-A: 195-202, Feb. 1985.[Abstract/Free Full Text]

  16. Kim, Y.-H.: Acetabular dysplasia and osteoarthritis developed by an eversion of the acetabular labrum. Clin. Orthop., 215: 289-295, 1987.

  17. Kristiansen, B.; Andersen, U. L.; Olsen, C. A.; and Varmarken, J. E.: The Neer classification of fractures of the proximal humerus. An assessment of interobserver variation. Skel. Radiol., 17: 420-422, 1988.[Medline]

  18. Landis, J. R., and Koch, G. G.: The measurement of observer agreement for categorical data. Biometrics, 33: 159-174, 1977.[Medline]

  19. Murphy, S. B.; Ganz, R.; and Müller, M. E.: The prognosis in untreated dysplasia of the hip. A study of radiographic factors that predict the outcome. J. Bone and Joint Surg., 77-A: 985-989, July 1995.[Abstract/Free Full Text]

  20. Nielsen, J. O.; Dons-Jensen, H.; and Sorensen, H. T.: Lauge-Hansen classification of malleolar fractures. An assessment of the reproducibility in 118 cases. Acta Orthop. Scandinavica, 61: 385-387, 1990.[Medline]

  21. Rasmussen, S.; Madsen, P. V.; and Bennicke, K.: Observer variation in the Lauge-Hansen classification of ankle fractures. Precision improved by instruction. Acta Orthop. Scandinavica, 64: 693-694, 1993.[Medline]

  22. Salter, R. B.: Innominate osteotomy in the treatment of congenital dislocation and subluxation of the hip. J. Bone and Joint Surg., 43-B(3): 518-539, 1961.

  23. Salter, R. B., and Dubos, J.-P.: The first fifteen years' personal experience with innominate osteotomy in the treatment of congenital dislocation and subluxation of the hip. Clin. Orthop., 98: 72-103, 1974.

  24. Schoenecker, P. L., and Strecker, W. B.: Congenital dislocation of the hip in children. Comparison of the effects of femoral shortening and of skeletal traction in treatment. J. Bone and Joint Surg., 66-A: 21-27, Jan. 1984.[Abstract/Free Full Text]

  25. Severin, E.: Contribution to the knowledge of congenital dislocation of the hip joint. Late results of closed reduction and arthrographic studies of recent cases. Acta Chir. Scandinavica, Supplementum 63, 1941.

  26. Severin, E.: Congenital dislocation of the hip. Development of the joint after closed reduction. J. Bone and Joint Surg., 32-A: 507-518, July 1950.[Free Full Text]

  27. Sidor, M. L.; Zuckerman, J. D.; Lyon, T.; Koval, K.; Cuomo, F.; and Schoenberg, N.: The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J. Bone and Joint Surg., 75-A: 1745-1750, Dec. 1993.[Abstract/Free Full Text]

  28. Siebenrock, K. A., and Gerber, C.: The reproducibility of classification of fractures of the proximal end of the humerus. J. Bone and Joint Surg., 75-A: 1751-1755, Dec. 1993.[Abstract/Free Full Text]

  29. Spitznagel, E. L., and Helzer, J. E.: A proposed solution to the base rate problem in the kappa statistic. Arch. Gen. Psychiatry, 42: 725-728, 1985.[Abstract/Free Full Text]

  30. Stulberg, S. D., and Harris, W. H.: Acetabular dysplasia and development of osteoarthritis of hip. In The Hip. Proceedings of the Second Open Scientific Meeting of The Hip Society, pp. 82-93. St. Louis, C. V. Mosby, 1974.

  31. Svanholm, H.; Starklint, H.; Gundersen, H. J.; Fabricius, J.; Barlebo, H.; and Olsen, S.: Reproducibility of histomorphologic diagnosis with special reference to the kappa statistic. APMIS, 97: 689-698, 1989.[Medline]

  32. Thomsen, N. O. B.; Overgaard, S.; Olsen, L. H.; Hansen, H.; and Nielsen, S. T.: Observer variation in the radiographic classification of ankle fractures. J. Bone and Joint Surg., 73-B(4): 676-678, 1991.

  33. Weinstein, S. L.: Natural history of congenital hip dislocation (CDH) and hip dysplasia. Clin. Orthop., 225: 62-76, 1987.

  34. Williamson, D. M., and Benson, M. K. D.: Late femoral osteotomy in congenital dislocation of the hip. J. Bone and Joint Surg., 70-B(4): 614-618, 1988.

  35. Zionts, L. E., and MacEwen, G. D.: Treatment of congenital dislocation of the hip in children between the ages of one and three years. J. Bone and Joint Surg., 68-A: 829-846, July 1986.[Abstract/Free Full Text]

  36. Zwick, R.: Another look at interrater agreement. Psychol. Bull., 103: 374-378, 1988.[Medline]




