Patient-reported outcome measures in arthroplasty registries

Report of the Patient-Reported Outcome Measures Working Group of the International Society of Arthroplasty Registries
Part II. Recommendations for selection, administration, and analysis

Pages 9-23 | Received 15 Aug 2015, Accepted 17 Sep 2015, Published online: 26 May 2016

Abstract

The International Society of Arthroplasty Registries (ISAR) Patient-Reported Outcome Measures (PROMs) Working Group has evaluated and recommended best practices in the selection, administration, and interpretation of PROMs for hip and knee arthroplasty registries. The 2 generic PROMs in common use are the Short Form health surveys (SF-36 or SF-12) and EuroQol 5-dimension (EQ-5D). The Working Group recommends that registries should choose specific PROMs that have been appropriately developed with good measurement properties for arthroplasty patients. The Working Group recommends the use of a 1-item pain question (“During the past 4 weeks, how would you describe the pain you usually have in your [right/left] [hip/knee]?”; response: none, very mild, mild, moderate, or severe) and a single-item satisfaction outcome (“How satisfied are you with your [right/left] [hip/knee] replacement?”; response: very dissatisfied, dissatisfied, neutral, satisfied, or very satisfied). Survey logistics include patient instructions, paper- and electronic-based data collection, reminders for follow-up, centralized as opposed to hospital-based follow-up, sample size, patient- or joint-specific evaluation, collection intervals, frequency of response, missing values, and factors in establishing a PROMs registry program. The Working Group recommends including age, sex, diagnosis at joint, general health status preoperatively, and joint pain and function score in case-mix adjustment models. Interpretation and statistical analysis should consider the absolute level of pain, function, and general health status as well as improvement, missing data, approaches to analysis and case-mix adjustment, minimal clinically important difference, and minimal detectable change.
The Working Group recommends data collection immediately before and 1 year after surgery, a threshold of 60% for acceptable frequency of response, documentation of non-responders, and documentation of incomplete or missing data.

The International Society of Arthroplasty Registries (ISAR) Patient-Reported Outcome Measures (PROMs) Working Group was established to convene, evaluate, and advise on best practices in the selection, administration, and interpretation of PROMs for hip and knee arthroplasty in registries worldwide. This report reviews the Working Group recommendations for selection, administration, and analysis of PROMs by arthroplasty registries.

Lessons learned from patient-reported outcome measures in arthroplasty registries

Patient-centered outcomes reflect the current change of focus from volume-based to value-based healthcare delivery. Measuring patient-centered outcomes has provided important information about outcomes that matter to patients. These outcomes confirm that joint replacement is an effective treatment for disabling hip or knee joint diseases. In addition, these outcomes provide information about an important minority of patients who do not improve as expected or express dissatisfaction with the results of the intervention (Baker et al. 2007, Rolfson et al. 2011a).

Data from arthroplasty registries have identified several patient-related determinants of pain relief, functional outcomes, and satisfaction such as emotional health (Franklin et al. 2008, Rolfson et al. 2009), presence of coexisting low back pain, and other medical conditions that impair ambulation (Ayers et al. 2013, Gordon et al. 2014b), socioeconomic status (Greene et al. 2014a), and obesity (Jameson et al. 2014b). This information is not intended to give reasons to exclude patients from having joint replacement. However, these variables should be considered in the shared decision-making process to inform patients and give them realistic expectations.

The existing PROMs programs have provided knowledge about the association between surgical factors such as surgical approach and patient-reported outcomes (Jameson et al. 2014a, Lindgren et al. 2014). Furthermore, registry data have shown that low function after hip replacement is a risk factor for having a subsequent revision (Devane et al. 2013). However, considering the large amount of literature about PROMs in joint replacement from registries and clinical studies, it is important to recognize that little published information is available on how PROMs may be used in clinical practice to improve results.

Patient-reported outcome measures in hip and knee replacement

Generic patient-reported outcome measures

There are 2 primary surveys used in joint replacement registries to measure health-related quality of life (HRQoL): the Short Form health surveys (Optum Inc. 2015) and the EuroQol health outcome measures (EuroQol Group 2015) (Table 1). The Short Form 36 health survey (SF-36) includes 8 dimensions of health that are summarized into 1 physical and 1 mental scale component (Ware et al. 1992). The SF-36 is the most commonly used generic PROM in clinical trials, and is psychometrically sound for patients who have osteoarthritis (Kosinski et al. 1999). However, for routine follow-up in joint replacement registries, the SF-12, which is a shortened version of the SF-36, has often been preferred (Ware et al. 1996, Rolfson et al. 2011b). Although Short Form tools require licensing, the equivalent Veterans RAND 12-item survey (VR-12) and 36-item survey (VR-36) are available for non-commercial use without charge (Boston University School of Public Health 2015).

Table 1. Generic and specific patient-reported outcome measures commonly used in hip and knee arthroplasty

The EuroQol 5-dimension (EQ-5D) is a generic measure of health status developed by the EuroQol Group (EuroQol Group 1990). The survey includes 5 health outcome domains or items that can be summarized into a utility score, and an accompanying visual analog scale (VAS) that addresses current health state (EQ VAS). There are several different value sets to calculate the utility scores, and each value set represents the preferences of the population from which it was derived. Thus, comparisons of results using utility indices calculated with different value sets may be difficult (Gordon et al. 2013). Differences in the methods used to establish value sets may explain differences between value sets more than differences in the preferences of the populations in which they were developed.

The EQ-5D-5L is an extended version of the EQ-5D that has 5 response options for each dimension (Herdman et al. 2011). The original EQ-5D, which has 3 levels of response options (EQ-5D-3L), is most commonly used and has been validated for patients with osteoarthritis (Barton et al. 2009, Obradovic et al. 2013). However, the EQ-5D-5L has better psychometric properties (such as better responsiveness and lower ceiling effects) than the EQ-5D-3L, and increased use of the EQ-5D-5L is anticipated in clinical studies and registries (Greene et al. 2014b).

Both Short Form and EuroQol tools are commonly used, and there are no strong advantages of one tool over the other. Thus, the ISAR PROMs Working Group does not make specific recommendations about the preferred generic PROMs tool. A crosswalk algorithm is available to convert SF-12 responses to EQ-5D index scores, which may enable comparisons between the tools (Le 2014).

Specific patient-reported outcome measures

Numerous comprehensive specific PROMs instruments are available for patients who have hip or knee problems (Table 1). These measures typically encompass several outcome domains. The Western Ontario and McMaster Universities Arthritis Index (WOMAC) assesses pain, disability, and joint stiffness in patients who have hip or knee osteoarthritis (Bellamy et al. 1988). This 24-item questionnaire, with >90 translations available, has been shown to be valid, reliable, and responsive to osteoarthritis outcomes and is commonly used in clinical trials (Bellamy et al. 1988, McConnell et al. 2001). A 12-item version including all 5 pain questions of the original full questionnaire and 7 of the disability questions has also been developed and validated in a relevant osteoarthritis population (Whitehouse et al. 2003). The WOMAC requires licensing, and is used in some large local and regional registries in the USA.

The non-proprietary Knee injury and Osteoarthritis Outcome Score (KOOS) and Hip disability and Osteoarthritis Outcome Score (HOOS) were developed as extensions and comprehensive alternatives to the WOMAC (Roos et al. 1998, Nilsdotter et al. 2003). The KOOS and HOOS questionnaires are being used in some United States and European registries. However, they may be burdensome for respondents because of the large number of questions (KOOS, 42 questions; HOOS, 40 questions). Thus, the KOOS and HOOS may not be appropriate for use in routine follow-up programs.

The KOOS Physical Function Short Form (KOOS-PS, 7 questions) and HOOS Physical Function Short Form (HOOS-PS, 5 questions) are short versions of the KOOS and HOOS that include physical function items but not pain items (Davis et al. 2008, Perruccio et al. 2008). Despite their brevity, the KOOS-PS and HOOS-PS have good measurement properties for physical functioning in patients with hip and knee osteoarthritis. These surveys do not require licensing and have been translated into many languages (total: KOOS-PS, 15 languages; HOOS-PS, 10 languages). Although the KOOS-PS and the HOOS-PS are not established measures in joint replacement registries, these short questionnaires have been included in the International Consortium for Health Outcomes Measurement (ICHOM) hip and knee osteoarthritis standard set of outcomes measures (International Consortium for Health Outcomes Measurement 2015).

The commonly used joint-specific, 12-item Oxford Knee Score (OKS) and Oxford Hip Score (OHS) address pain and function, and were designed from interviews with patients to measure outcomes after knee or hip joint replacement surgery (Dawson et al. 1996a, 1998, Murray et al. 2007). These surveys are available in several languages (OKS, 20 languages; OHS, 10 languages) and are useful in clinical studies and joint replacement registries (Browne et al. 2013, Devane et al. 2013). The OKS and OHS require licensing but are available without charge for non-commercial use, and they are used in registry PROMs programs in the UK and New Zealand.

Although some PROMs instruments for hip or knee problems include pain measures, joint pain is often measured with a VAS or numeric rating scale (NRS) in isolation. Despite differences in granularity, possible modes of collection, and presentation requirements, VAS and NRS have an acceptable correspondence (Breivik et al. 2000, Hawker et al. 2011, Hjermstad et al. 2011). Manual readings of VAS scores on pen-and-paper questionnaires require more work than the NRS, and are more susceptible to transcription errors. With VAS surveys that require the respondent to mark a point on a line, it may be difficult to standardize the length of the line for different modes of data collection, especially when some patients complete the survey electronically and other patients use a hard-copy questionnaire. Furthermore, some patients may have conceptual difficulty in completing the VAS (Wewers et al. 1990).

There is variation in the specific PROMs instruments that are used in joint replacement registries. Thus, the ISAR PROMs Working Group does not make specific recommendations about which PROMs to use in arthroplasty registries. However, the Working Group recommends that registries should choose PROMs instruments that (1) were appropriately developed with a relevant patient population, and (2) have evidence of good measurement properties for patients who have arthroplasty.

Pain is considered the major construct of interest. So, for the purpose of harmonization, the Working Group agreed to recommend a 1-item pain question. The suggested measure was slightly modified from the first item of the OHS and OKS (“During the past 4 weeks, how would you describe the pain you usually have in your [right/left] [hip/knee]?”), which has 5 Likert-graded response options (none, very mild, mild, moderate, and severe). The original question is available in many languages and may be easily translated into other languages. The question has good measurement properties, validity, and reliability (Dawson et al. 1996b). Responses to this question are simpler to enter from pen-and-paper questionnaires than a VAS, which is difficult to read, and the question can be presented on different platforms.

Single-item satisfaction outcome measure

A single question to assess the outcome of hip or knee replacement is attractive because it is simple, easy to use, and may capture the essentials of outcomes after hip and knee arthroplasty. Although the goals of surgery include pain relief and functional improvement, the importance of these goals may vary because each patient may have specific personal expectations from surgery (Mancuso et al. 1997, 2001). A single item such as “satisfaction with the result of surgery” is conceptually appealing, has good face validity, and enables patients to express their unique points of view about the outcome of surgery.

Although it is well established that there is a correlation between satisfaction and objective improvements in pain, function, and general HRQoL (Robertsson et al. 2000, Mahomed et al. 2011, Judge et al. 2012a, Baker et al. 2013, Clement et al. 2013), satisfaction with the outcome of joint replacement may be affected by preoperative patient characteristics such as expectations (Mancuso et al. 1997, Noble et al. 2006, Hamilton et al. 2013), function (Hamilton et al. 2013), depression (Brander et al. 2007), and catastrophizing (Riddle et al. 2010). The occurrence of a postoperative complication may negatively affect overall satisfaction with surgery (Kay et al. 1983, Bourne et al. 2010). Postoperative satisfaction may therefore incorporate multiple and varied factors that affect the outcome of surgery.

There are drawbacks to using a single item to assess outcome. A single qualitative item may not measure the magnitude of change, which can be accomplished only by comparing pre- and postoperative measures of health status or by using a potentially less reliable health transition question. The response to a single qualitative item may also be affected by age, sex, and diagnosis, and it may be necessary to control for these factors for comparisons across regions (Kay et al. 1983, Robertsson et al. 2000, Merle-Vincent et al. 2011). Furthermore, it is not possible to perform cost-effectiveness analysis using a satisfaction question. Nevertheless, the benefit of using a single question may include a significantly higher frequency of response than with more comprehensive questionnaires (Robertsson et al. 2001). Published satisfaction frequencies are available from large samples (Robertsson et al. 2000, Garellick et al. 2009, 2010), but the results may be affected by the different ways in which the question is asked and the timing relative to the date of surgery.

When implementing a single-item satisfaction measure, the ISAR PROMs Working Group recommends the wording: “How satisfied are you with the results of your [right/left] [hip/knee] replacement?” When this question is asked less than 1 year after surgery, the qualification “so far” should be added, because clinically relevant improvement may occur between 6 and 12 months after surgery (Browne et al. 2013). The preferred response scale may include a range of options (very dissatisfied, dissatisfied, neutral, satisfied, and very satisfied) instead of a VAS to provide a clearer interpretation of the responses. The option “neutral” may be selected by patients who are uncertain or do not want to appear negative in their assessment; so, when reporting the overall proportion of patients who are satisfied, we recommend categorization of results from patients who respond “satisfied” or “very satisfied” as satisfied and other response options as dissatisfied.
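The recommended categorization of the 5 response options into satisfied and dissatisfied can be sketched in a few lines of Python. This is only an illustration: the function name `proportion_satisfied`, the lowercase labels, and the choice to ignore blank or unrecognized responses are assumptions made for the example, not part of the Working Group's recommendations.

```python
# Illustrative sketch. The five labels follow the response options named in
# the text; the function name and the handling of blanks are assumptions.
SATISFIED = {"satisfied", "very satisfied"}
VALID = SATISFIED | {"very dissatisfied", "dissatisfied", "neutral"}


def proportion_satisfied(responses):
    """Return the share of valid responses that count as satisfied.

    "neutral" is grouped with the dissatisfied responses, as the text
    recommends. Blank or unrecognized responses are left out of the
    denominator (an assumption of this sketch).
    """
    valid = [r.strip().lower() for r in responses
             if r and r.strip().lower() in VALID]
    if not valid:
        raise ValueError("no valid responses")
    return sum(r in SATISFIED for r in valid) / len(valid)
```

With this grouping, a registry that reports "satisfied" proportions treats a neutral answer the same way as a dissatisfied one, so the reported proportion is a conservative estimate.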

Survey logistics

Instructions for patients

For data collection and optimal frequency of response, it is necessary to inform the patient about the needs and goals of PROMs. The patient should be given instructions or a patient guide. Motivation of all orthopedic specialists is needed for the success of PROMs collection. Ideally, orthopedic specialists should inform their patients about the PROMs collection before and after joint replacement. However, the logistics in PROMs collection should not depend on orthopedic specialist engagement in the collection of PROMs. Giving information and instructions, and collecting surveys in the clinic should preferably be done by staff members specifically assigned to the task.

Paper-based and electronic-based data collection

The PROMs may be collected and scored on paper and/or electronically. Paper-based methods of collecting PROMs data may be simple, but it is important to maintain consistency in presenting the PROMs questionnaires in the same order for every collection. After the forms are completed, they should be reviewed for completeness or missing data. When the paper questionnaire is completed in the clinic, the staff members should review the questionnaire for missing data before the patient’s departure to maximize completed items. When the paper questionnaire is mailed to the hospital, it is difficult to ensure completeness of the questionnaire, and missing items must be accepted unless there is a mechanism to contact the patient for the missing items.

After the paper survey has been completed, the data should be entered in an electronic form. Manual data entry may be susceptible to human error, but electronic rules can be constructed to allow only plausible responses. Double data entry may reduce error but it is time-consuming. Alternatively, data may be entered using a specialized scanner and software, which may have disadvantages of cost, the need for training of staff members, and the limitation that some questionnaires still may require manual data entry.
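As an illustration of electronic rules that allow only plausible responses, the following Python sketch checks a manually entered record against a small rule table. The item names (`pain`, `satisfaction`, `eq_vas`), the rule table, and the function `check_entry` are hypothetical; the 0 to 100 integer range for the EQ VAS is an assumption of this sketch, and a real registry would define its own rules for each instrument.

```python
# Illustrative sketch of plausibility rules for manual data entry.
# Item names and the rule table are hypothetical.
RULES = {
    "pain": {"none", "very mild", "mild", "moderate", "severe"},
    "satisfaction": {"very dissatisfied", "dissatisfied", "neutral",
                     "satisfied", "very satisfied"},
    "eq_vas": range(0, 101),  # assumed integer scale from 0 to 100
}


def check_entry(record):
    """Return a list of (item, value) pairs that violate the rules.

    Missing items (None or absent) are not flagged here; they would be
    counted separately when missing values are reported.
    """
    problems = []
    for item, allowed in RULES.items():
        value = record.get(item)
        if value is None:
            continue
        if value not in allowed:
            problems.append((item, value))
    return problems
```

A data entry form could call such a check before saving and ask the operator to re-read the paper form when the returned list is not empty.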

With paper PROMs forms, personnel must apply identifiers and sort, distribute, and collect the PROMs. It may be simple to collect paper forms during routinely scheduled visits to the clinic, but data collection with paper forms between clinical visits may be difficult logistically. Furthermore, bias may be introduced because patients who have problems may be more likely to return for clinic visits and may return to the clinic more frequently (Dawson et al. 2010). Thus, it is recommended that follow-up questionnaires should be sent by post as close as possible to the due date.

Electronic versions of PROMs may be difficult to implement because additional instructions may be needed for the electronic format. The wording, punctuation, and response options of the PROMs instrument should be reproduced accurately in the electronic version. The electronic survey must allow patients to go back and change previously answered questions, as with the paper version. Electronic PROMs may be collected on various hardware platforms including desktop computers, laptop computers, tablets, or handheld devices using web-enabled survey tools. For electronic collection using tablets or handheld devices in the clinic, patients can be given the device with the PROMs instrument selected upon arrival, and they can select their answer to each item as presented. Alternatively, access to web-enabled survey tools can be provided at confidential computer stations in clinic waiting rooms or through links that are e-mailed to a patient’s own device.

Electronic data collection may be more efficient and less time-consuming than paper collection because electronic versions avoid the need for manual data entry and they enable immediate scoring. However, electronic PROMs may lack equivalence with paper PROMs because electronic PROMs require access to a digital system and may not be acceptable to all patients. Furthermore, in contrast to paper PROMs, electronic PROMs are often configured to require responses to all items to complete the questionnaire, so patients must provide a response to each item before they can continue to the next item.

Some registries provide the option of either paper- or electronic-based PROMs. In the past, PROMs were measured only with paper questionnaires. More recently, institutions and registries have used electronic surveys. Patients are becoming more familiar with digital platforms than previously, but digital methods are not currently acceptable to or feasible with all patients. Advantages of using both methods may include an increase in patient compliance and a decrease in missing or incomplete data. Although disadvantages may include the increased administrative challenges of maintaining both paper- and electronic-based methods, the ISAR PROMs Working Group recommends providing patients the option of paper or electronic PROMs (Rolfson et al. 2011c, Gliklich et al. 2014). However, the Working Group acknowledges that this approach may not be feasible for logistic and economic reasons in emerging registries.

The technology environment is changing rapidly: smartphones are becoming ubiquitous in younger populations, and more than 70% of baby-boomers (those born in the latter half of the 1940s) use a laptop or desktop computer on a daily basis (MarketingCharts 2015). As baby-boomers are now reaching their late 60s, the need for paper administration will diminish with time.

Reminders for follow-up

Automatic reminders may improve efficiency and timing of follow-up. Reminders may be created by connecting PROMs instruments with clinical data such as date of surgery. For paper-based PROMs, the reminder could be sent by hospital personnel. With some systems, electronic-based PROMs questionnaires may be sent automatically to patients. It is recommended that a reminder to non-responders be sent 2 weeks after sending the initial questionnaire. In addition, it is recommended that automatic blocking steps should be programmed in to avoid inappropriately sending follow-up PROMs instruments to patients who have withdrawn their informed consent or have died.
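The reminder and blocking rules described above can be sketched as a single decision function in Python. The function name `reminder_due` and its parameters are hypothetical, and the sketch assumes that the registry already holds the date the questionnaire was sent, the response status, the consent status, and the vital status of the patient.

```python
from datetime import date, timedelta

# Delay recommended in the text: 2 weeks after the initial questionnaire.
REMINDER_DELAY = timedelta(weeks=2)


def reminder_due(sent_on, today, responded,
                 consent_withdrawn=False, deceased=False):
    """Decide whether a reminder should be sent today (illustrative sketch).

    The blocking step comes first, so nothing is sent to patients who
    have withdrawn their consent or have died.
    """
    if consent_withdrawn or deceased:
        return False
    if responded:
        return False
    return today >= sent_on + REMINDER_DELAY
```

Placing the blocking check before any other condition means that a later change to the reminder schedule cannot accidentally bypass it.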

Centralized systems or hospital-based follow-up?

Follow-up PROMs may be administered by a centralized system such as an arthroplasty registry or national health service department. With centralized systems, paper or electronic questionnaires are sent to patients and data entry is managed at a central office. Alternatively, hospitals may assume responsibility for all aspects of data collection. With hospital-based systems, members of the hospital staff send the questionnaires to patients, collect data at clinic visits, and perform data entry. The advantages of a centralized system include a uniform method of communication with patients about PROMs. However, hospital-based systems have the advantage of the relationship with the patient, and some patients may be more willing to respond to a questionnaire from the local hospital or the surgeon’s office than one from a centralized agency. Legislation and funding may determine the organization of PROMs collection in different countries.

All patients or a select sample?

Depending on the goals of the PROMs assessment and available resources, the PROMs may be used to measure the entire patient population or a select sample. When PROMs are used for quality improvement, all patients having arthroplasty should be assessed, and a sample of the population from a hospital may be inadequate for analysis. In contrast, a sample of patients may be adequate for research purposes when it is a random, unbiased sample that has a sufficient number of patients for analysis and drawing of conclusions.

Evaluation of patients or joints?

Joint registries would usually organize data collection by primary intervention, joint, and laterality, and each joint that had a primary intervention would yield a new case. This joint-based approach may be suitable for implant surveillance and procedure-related outcomes studies, and most registry PROMs programs have used this approach. However, the approach may have disadvantages for evaluation of PROMs in patients who have multiple arthroplasties, because evaluation of each joint separately may cause too heavy a load in terms of questionnaires. In addition, the use of PROMs is expanding in disciplines other than hip and knee replacement, such as oncology and psychiatry, and this may also cause an increased questionnaire load for some patients.

The alternative is to measure outcomes with a condition- and patient-centered approach, without focusing on a specific joint. The Function and Outcomes Research for Comparative Effectiveness in Total Joint Replacement (FORCE-TJR) Registry measures pain for both hips and knees at all follow-up visits and focuses the specific PROMs on the most problematic joint. This issue must be considered in PROMs collection by registries. A condition- and patient-centered approach, as opposed to joint-centered, may enable longitudinal measurements in integrated care pathways that include early stages of hip and knee disorders and primary care treatment.

Collection intervals

For preoperative PROMs, the ISAR PROMs Working Group recommends collecting PROMs within 3–4 weeks before surgery and not on the day of surgery. Recommendations for postoperative PROMs collection vary between hip and knee arthroplasty. For hip replacement patients, 1 year after surgery is the optimum follow-up time, but it may be better to measure knee replacement patients at 18 months to determine full capacity of recovery. However, for good comparability between arthroplasty registries worldwide, the Working Group recommends follow-up measures at 1 year after surgery for both hip and knee arthroplasty.

Frequency of response

There is a wide variation in frequency of response in PROMs collection between joint registries. The ISAR PROMs Working Group therefore attempted to advise on a desirable minimum frequency of response. There are no general guidelines available from research papers or journals about an acceptable frequency of response in reporting PROMs data. The instructions for authors of the Journal of the American Medical Association (2015) give a requirement of ≥60% as a sufficient frequency of response. According to ISAR bylaws, full ISAR membership requires ≥80% completeness of procedures (International Society of Arthroplasty Registries 2015), a threshold that was selected arbitrarily and that reflected the bias associated with missing data. Nevertheless, the ISAR PROMs Working Group proposes a 60% threshold for an acceptable frequency of response because of external difficulties in capturing PROMs that may be unrelated to survey logistics. Registries that collect detailed demographic data may use imputation models to compute data for non-responders.

Definitions of frequency of response

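Frequency of response is usually calculated annually, and the calculation may depend on whether PROMs are captured for all procedures or for a sample. For a given measurement interval in all patients, the total number of questionnaires returned or data entries, each matched to a registered procedure, is divided by the total number of procedures registered during the corresponding year:

\[
\text{Frequency of response} = \frac{\text{number of questionnaires returned or data entries, each matched to a registered procedure}}{\text{total number of procedures registered during the corresponding year}}
\]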

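For follow-up responses, the number of responses returned is divided by the total number of procedures registered during the corresponding year minus the number of joints in patients who died before follow-up (FU):

\[
\text{Frequency of response at FU} = \frac{\text{number of responses returned}}{\text{total number of procedures registered during the corresponding year} - \text{number of joints in patients who died before FU}}
\]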

The number of procedures should be adjusted to the number intended for inclusion in the program. For logistic reasons, most registries do not collect preoperative PROMs for non-elective operations; therefore, non-elective procedures are usually excluded from calculations of response frequency.
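The two definitions given in this section, together with the proposed 60% threshold, can be expressed as a short Python sketch. The function and parameter names are hypothetical. The sketch assumes that the procedure counts passed in have already been restricted to the procedures intended for inclusion in the program (for example, with non-elective operations excluded).

```python
# Illustrative sketch of the response-frequency definitions in the text.
ACCEPTABLE_FREQUENCY = 0.60  # threshold proposed by the Working Group


def response_frequency(matched_responses, registered_procedures):
    """Responses matched to a registered procedure / procedures registered."""
    if registered_procedures <= 0:
        raise ValueError("no registered procedures")
    return matched_responses / registered_procedures


def follow_up_response_frequency(responses, registered_procedures,
                                 joints_in_deceased_patients):
    """Responses / (procedures registered - joints in patients who died before FU)."""
    eligible = registered_procedures - joints_in_deceased_patients
    if eligible <= 0:
        raise ValueError("no eligible procedures at follow-up")
    return responses / eligible


def is_acceptable(frequency):
    """True when the frequency meets the proposed 60% threshold."""
    return frequency >= ACCEPTABLE_FREQUENCY
```

For example, 540 follow-up responses for 1,000 registered procedures, of which 100 joints belong to patients who died before follow-up, give 540 / 900 = 0.6, which just meets the threshold.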

Characterization of non-responders is recommended to ensure that data are not invalidated because of non-response bias. Furthermore, some patients may not respond before or after surgery, and the frequency of patients who have complete pre- and postoperative data in final analyses may be lower than the frequency of respondents before or after surgery. It is therefore important to maximize patient response both before and after surgery.

Reporting of missing values

The proportion of entries with incomplete or missing values should be reported per item. Measure-specific guidelines often provide guidance about how to address missing items in summary scores. Conditional mean single imputation is a common method to address missing items, but there may be other and better methods for imputation. Instrument-specific guidelines are commonly available on how to address answers that have misplaced marks or ≥2 options marked for the same item. These inaccuracies should be addressed according to measure-specific guidelines.
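Reporting the proportion of missing values per item amounts to a simple count over the collected records. The following Python sketch assumes that each record is a dictionary keyed by item name and that a missing answer is stored as `None` or is absent; both the data layout and the function name `missing_proportion_per_item` are assumptions made for the example. It does not perform imputation, which should follow the measure-specific guidelines.

```python
def missing_proportion_per_item(records, items):
    """Return {item: proportion of records with a missing value}.

    A value counts as missing when the item is absent from the record
    or is stored as None (an assumption of this sketch).
    """
    total = len(records)
    if total == 0:
        raise ValueError("no records")
    return {
        item: sum(1 for record in records if record.get(item) is None) / total
        for item in items
    }
```

The per-item result can be tabulated next to the overall frequency of response so that readers can see which questions are skipped most often.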

Establishment of a patient-reported outcome measures registry program

Establishing a PROMs program is a major challenge. It is helpful to establish a project team that includes a dedicated project leader, members of the registry management or executive committee, orthopedic surgeons who have large networks in the country or region, registry coordinators, and PROMs researchers.

Several established PROMs programs began as a small test project. It may be helpful to launch a small test project with a few providers to test feasibility and learn from early experiences before the full-scale registry program is established. The ISAR PROMs Working Group identified the importance of obtaining a critical minimum number of participating orthopedic surgeons or hospitals (representing about 10% of all arthroplasties performed) as a key factor for success in implementation. The numerous challenges in managing local logistics may require the support of an influential colleague locally. Hospital and clinic participation may be facilitated by early strategic registry staff visits to the site to inform, encourage, and assist at the start of PROMs collection. After the first participant hospitals show successful operational data collection, other hospitals may join. Some registries that have high response frequencies inform clinical sites about successful strategies to integrate capture of PROMs with preoperative clinic flow and about optimal postoperative direct-to-patient capture of PROMs.

We recommend minimization of the number of items included in a PROMs program, but the set of questions should reflect the essential constructs sought (primarily pain and physical function) and should be meaningful to patients. The ISAR PROMs Working Group recommends selection of 1 generic and 1 specific PROMs instrument, and it proposes that all PROMs programs should use 2 single-item measures (1 measure for joint pain and 1 measure for satisfaction with the results of surgery) to harmonize the measures.

Risk factors and case-mix adjustment

Case-mix variables

Case-mix adjustment is required to limit unjust comparisons of PROMs after joint replacement. Case-mix adjustment may facilitate comparison of outcomes between centers, providers, or countries by minimizing variability due to patient characteristics. However, the collection of case-mix variables in addition to PROMs may increase the burden on patients and healthcare providers. Thus, the choice of case-mix variables must be carefully balanced to obtain adequate frequency of response.

It may be difficult to determine whether—and how extensively or precisely—to include factors in case-mix adjustment models, such as diagnosis, previous interventions, body habitus, comorbidities, socioeconomic status, ethnicity, physical function, and personality traits. Both adjusted and unadjusted PROMs results should be reported, regardless of the case-mix variables used, to ensure transparency and comprehensive interpretation.

Variables to consider in case-mix adjustment

Well-designed randomized controlled trials attempt to minimize systematic differences between groups through randomization. Observational cohorts do not have this luxury. In an observational study, we need to minimize bias but accept that some bias may remain. A comprehensive set of confounding variables has to be defined, but it is most likely impossible to collect and adjust for all confounding variables. Numerous preoperative patient and clinical factors are associated with variation in postoperative pain, physical function, general health, and patient satisfaction. Older age and female sex are associated with poorer postoperative function after total joint replacement (Jones et al. 2001, Franklin et al. 2008, Rolfson et al. 2011a, Judge et al. 2012b, Williams et al. 2013, Gordon et al. 2014a). Although higher body mass index may be associated with less improvement in pain and function (Lübbeke et al. 2007, Franklin et al. 2008, Jameson et al. 2014b), its effect on outcome is small (Lübbeke et al. 2007, Jameson et al. 2014b), and body mass index may not be a meaningful predictor of outcome after total joint replacement (Judge et al. 2012b, 2014).

Table 2. Case-mix variables in arthroplasty registries

Preoperative status is a strong predictor of pain and function after total joint replacement (Franklin et al. 2008). Patients who have more pain or poorer function before surgery are more likely to achieve greater postoperative gains (Franklin et al. 2008, Rolfson et al. 2009, Judge et al. 2012b, Greene et al. 2015). The ISAR PROMs Working Group recommends that registries that collect postoperative PROMs should also collect preoperative measures of pain and function to risk-adjust adequately when comparing outcomes between clinical groups. In addition, collection of pre- and postoperative PROMs may enable calculation of the degree of change or improvement after surgery.

Measures of socioeconomic status such as income, work status, and education affect postoperative PROMs (Greene et al. 2014a, Neuburger et al. 2012, 2013). However, these measures are recorded inconsistently in registries worldwide. Currently, United States Medicare is debating the use of socioeconomic status in case-mix adjustment.

The primary diagnosis or cause of knee or hip pain and disability varies between joint replacement patients, and diagnosis affects outcome (Mourão et al. 2009, Liao et al. 2015, Schrama et al. 2015). Although primary knee and hip osteoarthritis are the most common indications for knee or hip replacement, other acute and chronic conditions may also result in surgery, such as inflammatory conditions (rheumatoid arthritis), osteonecrosis, congenital disorders, femoral neck fracture, or sequelae following traumatic ligamentous injuries. There is no consensus about the optimal system to record the joint diagnosis (Bijlsma et al. 2011). Furthermore, some patients have undergone previous joint interventions of various kinds, such as arthroscopic procedures or osteotomy, and there is no consistent convention to document previous procedures. Research is needed to refine the categorization of joint pathology and previous procedures that may be important in case-mix adjustment. The ISAR PROMs Working Group recommends documentation of diagnosis such as primary osteoarthritis or other joint conditions.

Medical and musculoskeletal comorbidities are associated with varied patient-reported outcomes after joint replacement (Dieppe and Lohmander 2005). Administrative or billing data include lists of coexisting medical conditions that may be used for risk adjustment in summarized measures (such as Charlson comorbidity index, Elixhauser comorbidity measure, or number of comorbidities) or specific conditions (such as presence or absence of diabetes mellitus). Depending on the condition, these measures have the potential to improve prediction of postoperative pain and function. However, the role of medical risk factors may be small in understanding the variation in PROMs (Greene et al. 2015). Furthermore, administrative medical comorbidity coding may differ between centers and countries, and may be affected by the quality of data entry and reimbursement procedures. Although medical comorbidities are used regularly in case-mix adjustment, musculoskeletal comorbidities and risky behaviors such as smoking and alcohol use are less often included (Bijlsma et al. 2011). These comorbidity measures are not documented consistently in registries or administrative data (Greene et al. 2015). As recognized in the Charnley classification for grading of disabilities, outcomes from hip arthroplasty may be affected by the severity of pain in the contralateral hip, total osteoarthritis burden, and the presence of systemic disease that interferes with function, and these factors may also affect postoperative outcomes after knee arthroplasty (Jenkins et al. 2011, NHS England Analytical Team 2013, Greene et al. 2015). Although the Charnley classification has traditionally been assessed by the surgeon, a literature search did not show any study on the interobserver reliability of the Charnley classification. In the Swedish Hip and Knee Arthroplasty Registers, the classification is used as reported by the patient (Rolfson et al. 2011a, NHS England Analytical Team 2013). Coexisting contralateral knee osteoarthritis and low back pain have also been reported to affect pain and function after total joint replacement (Ayers et al. 2013). Moreover, the American Society of Anesthesiologists Physical Status Classification (ASA) is often documented in operative records and has been associated with PROMs after joint replacement (Dunbar et al. 2004).

In summary, medical and musculoskeletal comorbidities influence postoperative PROMs and they should be included in research and refined comparisons of outcomes between groups. However, inconsistent and incomplete documentation may limit the use of these comorbidity measures in routine postoperative risk-adjustment models. Overall health preoperatively (the sum of comorbidities) may be determined from patient-reported general health surveys such as EQ-5D, SF-12, or SF-36, and should be included as a factor in case-mix adjustment models (Browne et al. 2008). The quality of the case-mix variables depends on the completeness and accuracy of the data. The data may be collected from patients, physicians, or healthcare systems.

There are pitfalls associated with insufficient case-mix adjustment (residual confounding), differences in the way case-mix variables are assessed between centers, and the constant risk fallacy (Nicholl 2007) that occurs when the relation between the risk factor and outcome differs between groups. In addition, patient factors may interact with care provided and may differ between hospitals or countries.

The ISAR PROMs Working Group recommends that age, sex, diagnosis at joint, and preoperative health status (pain and function) score should be included in case-mix adjustment models to ensure valid comparisons of postoperative PROMs between clinical settings or countries. In addition, case-mix adjustment models should include preoperative PROMs, education level (as a measure of socioeconomic status), and Charnley classification (as a measure of comorbidity). Additional research is necessary to evaluate the usefulness and performance of other patient factors in case-mix adjustment models.
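The effect of case-mix adjustment on a provider comparison can be illustrated with a minimal sketch. All data below are simulated and every name (`provider_b`, `preop`, `provider_coefficient`) is hypothetical; ordinary least squares is used only as one simple form of multivariable regression, not as the method of any particular registry. Diagnosis and education level are omitted for brevity and would enter as additional indicator columns. In the simulation the true provider effect is zero, but provider B treats patients with worse preoperative scores.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 2000

# Simulated registry extract (assumption: not real patient data).
provider_b = rng.integers(0, 2, size=n)   # 0 = provider A, 1 = provider B
age = rng.normal(70.0, 8.0, size=n)
female = rng.integers(0, 2, size=n)
# Provider B operates on patients with worse preoperative scores.
preop = 20.0 - 5.0 * provider_b + rng.normal(0.0, 4.0, size=n)
# The postoperative score depends on case mix only; the true provider effect is zero.
postop = (30.0 + 0.5 * preop - 0.1 * (age - 70.0) - 1.0 * female
          + rng.normal(0.0, 3.0, size=n))


def provider_coefficient(outcome, provider, covariates=None):
    """Fit ordinary least squares and return the coefficient on the provider indicator."""
    columns = [np.ones(len(outcome)), provider]
    if covariates is not None:
        columns.extend(covariates)
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1]


# Report both figures, as recommended for transparency.
unadjusted = provider_coefficient(postop, provider_b)
adjusted = provider_coefficient(postop, provider_b, [age, female, preop])
print(f"unadjusted difference: {unadjusted:+.2f}")
print(f"adjusted difference:   {adjusted:+.2f}")
```

In this simulated setting the unadjusted comparison makes provider B look worse, whereas the adjusted estimate lies close to zero, which is the kind of unjust comparison that case-mix adjustment is intended to limit.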

Interpretation and statistical analysis

Improvement or absolute level of pain, function, and health status

The aim of arthroplasty is to return patients to the highest possible level of pain relief, function, and health status. Patients who have severe symptoms before intervention have the greatest potential for improvement. However, patients who report the worst preoperative health states may not have as good an outcome after surgery as patients who report fewer problems before surgery (Vogl et al. 2014). Furthermore, the bounded design of instruments may limit differentiation between good and excellent outcomes. Evaluation of outcomes from the patient standpoint must therefore consider the amount of change as well as the absolute level of pain, function, and HRQoL at a specific time after surgery.

There is a strong positive association between pre- and postoperative PROMs scores. The magnitude of change in PROMs scores that occurs from before to after joint replacement surgery may cause incorrect conclusions about the quality of healthcare providers. It is quite probable that different surgeons would operate on patients who have different preoperative levels of severity, due in part to healthcare funding, surgeon training, and referral patterns. Arthroplasty surgeons and researchers are therefore encouraged to engage in the organization of the care of arthritis symptoms and arthroplasty. 

Missing data—categories and causes

The goal of arthroplasty PROMs analysis is to make valid inferences and learn from the data to improve arthroplasty care. Missing data can introduce bias. In prospective collection of registry data, there are often missing data and this cannot be entirely avoided. The cause of missing data must therefore be considered before starting the analysis.

Missing data can be classified into 3 categories (Little and Rubin 2002): (1) missing completely at random (MCAR), where the reason for missing data is unrelated to any outcome of interest; (2) missing at random (MAR), where the reason for missing data depends on known covariates; and (3) missing not at random (MNAR), where the reason for missing data is associated with unknown or unmeasured covariates.
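The practical difference between these categories can be shown with a small simulation. This is a sketch under stated assumptions: the scores, the 30% and 80% non-response probabilities, and the cut-off of 40 points are all invented for illustration, and only MCAR and MNAR are simulated (MAR would require a measured covariate that drives non-response).

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

# Hypothetical postoperative scores on a 0-100 scale (higher is better).
scores = rng.normal(loc=50.0, scale=10.0, size=10_000)

# MCAR: every response has the same 30% chance of being missing.
mcar_observed = scores[rng.random(scores.size) >= 0.30]

# MNAR: patients with poor outcomes (score < 40) fail to respond 80% of the time,
# so missingness depends on the unobserved value itself.
drop_prob = np.where(scores < 40.0, 0.80, 0.0)
mnar_observed = scores[rng.random(scores.size) >= drop_prob]

full_mean = scores.mean()
mcar_bias = mcar_observed.mean() - full_mean
mnar_bias = mnar_observed.mean() - full_mean
print(f"bias of observed mean under MCAR: {mcar_bias:+.2f}")
print(f"bias of observed mean under MNAR: {mnar_bias:+.2f}")
```

Under MCAR the observed mean stays close to the full-sample mean, whereas under the MNAR mechanism the observed mean is shifted upward because poor outcomes are under-represented.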

Patients who have missing data may be systematically different from patients with complete information. It is important to understand the cause of the missing data and to differentiate between missing data due to item non-response (failure to complete an item) and missing data due to unit non-response (where the patient is lost to follow-up). Item non-response in PROMs is frequently observed and associated with difficulty, relevance, and importance of the question to the patient. In HOOS and KOOS, sports-domain and recreation-domain questions are the most commonly skipped items. When the outcome depends on a patient’s expectations in performing high-impact activities, the effect on these surveys may be biased unless the expectation in sporting activities is considered in the analysis. Unit non-response occurs when patients are lost to follow-up for administrative reasons or because of comorbidities, unwillingness to provide additional information, or death. Thus, variables associated with missing data should be identified and controlled for in the analysis.

The best method of addressing missing data is to minimize item and unit non-response during data collection. This can be achieved by making questionnaires as short as possible, ensuring that the PROMs are appropriate for and relevant to the population of interest, and using questions that are easily interpreted. Strategies that maximize frequency of response for collection of postoperative data may also be considered, such as envelope teasers, recorded delivery, monetary incentives, shorter questionnaires, pre-notification, and follow-up contact (Edwards et al. 2002, 2007).

Missing data—strategies

It is important to understand the consequences of missing data and to adopt an appropriate analytical strategy. When data are MCAR or MAR, complete case analysis (list-wise deletion) can be used to generate estimates with minimal bias (under MAR, provided the analysis conditions on the covariates that drive missingness), although discarding incomplete cases reduces the sample size and therefore inflates standard errors. Other methods may be employed to recover this lost efficiency.

With MCAR or MAR assumptions, analyses such as multiple imputation and full information maximum likelihood (FIML) make allowances for missing data and recover the efficiency lost in complete case analysis. In multiple imputation, plausible values are drawn for each missing value several times, and the resulting uncertainty is propagated to the analyses to correct the standard errors. In FIML, parameters and standard errors are estimated in 1 step. Multiple imputation and FIML routines are available in most statistical software packages. Although multiple imputation is a general method that can be used with most estimation methods, FIML is used with structural equation modeling. However, modern structural equation modeling programs support a wide class of generalized linear models and are straightforward to use.
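The way multiple imputation propagates uncertainty to the standard error can be sketched with the standard pooling rules (commonly called Rubin's rules). This is an illustration only: the function name `pool_rubin` and the example numbers are assumptions, and in practice the pooling is done by the statistical package that performs the imputation.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns the pooled estimate and its pooled standard error.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    pooled_estimate = q.mean()
    within = u.mean()              # average sampling variance
    between = q.var(ddof=1)        # variance caused by the missing data
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, float(np.sqrt(total))


# Hypothetical results from 3 imputed datasets.
estimate, std_error = pool_rubin([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
print(estimate, std_error)
```

The pooled standard error is larger than the standard error from any single imputed dataset, because the between-imputation variance reflects the extra uncertainty caused by the missing values.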

Despite the wide availability of multiple imputation and FIML methods, several older single-imputation methods are frequently used but are not recommended. These include unconditional mean, conditional mean, and last observation carried forward (LOCF). These methods perform poorly, may introduce substantial bias, and underestimate standard errors. Current practices for handling missing data have been reviewed (Graham 2009).
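A short numerical sketch shows why unconditional mean imputation underestimates variability. The 4 observed scores and the 2 missing responses are invented for illustration.

```python
import statistics

# Hypothetical observed scores; 2 further patients did not respond.
observed = [10.0, 20.0, 30.0, 40.0]
n_missing = 2

# Unconditional mean imputation: fill each gap with the observed mean.
fill_value = statistics.mean(observed)
imputed = observed + [fill_value] * n_missing

sd_observed = statistics.stdev(observed)
sd_imputed = statistics.stdev(imputed)
print(f"SD of observed data:     {sd_observed:.2f}")
print(f"SD after mean imputation: {sd_imputed:.2f}")
```

The imputed values add no spread but increase the apparent sample size, so both the standard deviation and the standard error derived from it are too small.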

Analysis approaches and case-mix adjustment

Descriptive statistics should be reported, including mean, standard deviation, median, and interquartile range for continuous variables and frequency, proportion, and rate for categorical variables. Both preoperative and follow-up information should be summarized when a PROM is the primary outcome of interest. The initial state and progression of PROMs should be included in analyses. The role of confounding and case mix should be considered in PROMs data analysis, and adjustment for confounding and case-mix factors should be guided by causal knowledge (Hernán et al. 2002). Confounding and case-mix factors may be adjusted (1) in the design, with restriction or matching techniques such as propensity score or radius matching, or (2) in the analysis, using inverse probability weighting, stratification, restriction, and multivariable regression.
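The descriptive summaries listed above can be produced with a few lines of code. The sketch below uses only the Python standard library; the function names and the example values are hypothetical, and the quartiles follow the default method of `statistics.quantiles`, which may differ slightly from other software.

```python
import statistics
from collections import Counter


def summarize_continuous(values):
    """Mean, standard deviation, median, and interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": (q1, q3),
    }


def summarize_categorical(labels):
    """Frequency and proportion for each category."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (count, count / total) for label, count in counts.items()}


# Hypothetical preoperative scores and sex for 8 patients.
preop_scores = [12, 18, 20, 25, 31, 36, 40, 44]
sex = ["F", "F", "M", "F", "M", "F", "M", "F"]

continuous = summarize_continuous(preop_scores)
categorical = summarize_categorical(sex)
print(continuous)
print(categorical)
```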

The clinical relevance of changes in patient-reported outcome measures

A common question concerns the clinical relevance of changes in PROMs. Although every measure should be interpreted by defining thresholds for perceived improvement or deterioration, these thresholds may be difficult to define. Several methods are available for this purpose, but there is no consensus about the optimal method (King 2011). The minimal change or difference estimates calculated for a specific PROMs instrument may vary with the method used, intervention features, population characteristics, and instrument range. There is concern about universal thresholds of minimum important difference, and more research is needed on the application of such values in arthroplasty outcomes evaluation.

Statistical testing has been used to compare providers. However, it may be more informative to supply proportional data about patients improving or deteriorating, including well-grounded thresholds for minimum important differences.
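Reporting of proportions improving or deteriorating can be sketched as follows. The threshold of 5 points and the patient scores are placeholders chosen for illustration, not validated minimum important differences, and the function names are hypothetical. The sketch assumes a score on which higher values are better.

```python
from collections import Counter


def classify_change(pre, post, threshold):
    """Label one patient's change relative to a minimum important difference."""
    change = post - pre
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "deteriorated"
    return "unchanged"


def change_proportions(pairs, threshold):
    """Proportion of patients in each category for (pre, post) score pairs."""
    counts = Counter(classify_change(pre, post, threshold) for pre, post in pairs)
    total = len(pairs)
    return {label: counts[label] / total
            for label in ("improved", "unchanged", "deteriorated")}


# Hypothetical (preoperative, postoperative) scores for 5 patients.
pairs = [(20, 40), (25, 27), (30, 22), (15, 35), (28, 30)]
proportions = change_proportions(pairs, threshold=5)
print(proportions)
```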

Minimal clinically important difference and minimal detectable change

Clinicians and researchers rely on PROMs to assess and compare treatments and make decisions for clinical practice. However, numerical PROMs scores may lack direct clinically relevant meaning. Thus, minimal clinically important difference (MCID) and minimal detectable change (MDC) have been developed to represent the threshold needed to define treatment effectiveness (Rolfson et al. 2015). A change in PROM score that reaches the MCID or MDC indicates clinical relevance and provides justification for implementation in clinical practice.

There is no standard MCID or MDC, because the MCID or MDC is specific to different PROMs, conditions, and populations. The MCID and MDC should be interpreted with caution and should consider measurement error for the PROM. This may be estimated with the MDC90, which is an MDC estimate that has a conventional confidence level of 90% (Kennedy et al. 2005, Kovacs et al. 2008). The interpretation of the MDC90 is that 90% of truly stable patients will show random variation of less than this magnitude when assessed on several occasions.
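The report does not give a formula for the MDC90. A commonly used formulation, shown here as an assumption, derives it from the standard error of measurement (SEM), which in turn is estimated from the standard deviation of scores and a test-retest reliability coefficient such as the intraclass correlation coefficient (ICC). The input values below are invented.

```python
import math

Z_90 = 1.645  # two-sided 90% confidence multiplier from the normal distribution


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    return sd * math.sqrt(1.0 - reliability)


def mdc90(sd, reliability):
    """MDC90 = 1.645 * sqrt(2) * SEM; sqrt(2) reflects error in both measurements."""
    return Z_90 * math.sqrt(2.0) * standard_error_of_measurement(sd, reliability)


# Hypothetical instrument: SD of 10 points and test-retest ICC of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
threshold = mdc90(10.0, 0.91)
print(f"SEM = {sem:.2f}, MDC90 = {threshold:.2f}")
```

With these invented inputs, a change smaller than about 7 points could not be distinguished from measurement error at the 90% level.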

Involvement of statisticians and epidemiologists

Statistical analysis and interpretation of PROMs data is complex. The ISAR PROMs Working Group therefore recommends that registries should involve adequately trained biostatisticians and/or epidemiologists in processing and publishing PROMs data.

Future directions

Integration of patient-reported outcome measures into clinical decision-making

Discussions about risks and expected benefits of medical interventions are an integral component of the shared decision-making process in the encounter between healthcare professionals and patients (Makoul et al. 2006). These discussions provide patients with informed choices about treatment options that are based on the best evidence currently available (Lurie and Weinstein 2001). Although widely endorsed, the implementation of shared decision-making in clinical practice is limited, in part because of the lack of tools to display evidence and support the process (Weinstein et al. 2007). The decision to proceed with arthroplasty should be based on a comprehensive benefit-risk assessment and on the patient’s preference. The benefit-risk tool should be derived from multiple measures including information about expected PROMs (Dieppe et al. 2011). PROMs data collected in registries will help to establish these decision-making tools, and early developments are in progress (Greene 2015).

Assessment of the usefulness of patient-reported outcome measures in post-marketing implant surveillance

Only a few registry-based studies have used PROMs in the comparison of different types of implants (Lübbeke et al. 2014) and surgical techniques (Jameson et al. 2014a, Lindgren et al. 2014). In the context of the urgently needed improvement of post-marketing implant surveillance, it is important to determine the value of PROMs as an early indicator of implant failure.

Computer-adaptive testing

The US National Institutes of Health (NIH) have initiated the Patient-Reported Outcomes Measurement Information System (PROMIS) to standardize PROMs used in studies that are funded by the NIH (Patient-Reported Outcomes Measurement Information System 2015). Some of the PROMIS measures use computer-adaptive testing (CAT) to minimize questionnaire burden on patients and to avoid floor and ceiling effects. Large banks of questions have been established for particular health status domains or constructs. The CAT survey pulls individual questions from the item bank. Based on the response to previous questions, the CAT system assigns the next question using item response theory, to expose a minimum set of relevant questions to the respondent and to create a summary score for the domain. The current feasibility of using CAT surveys with PROMs in arthroplasty registries is questionable, because CAT surveys are computer-based and many registries use paper-based forms. Furthermore, the content validity of CAT surveys has not been established for arthroplasty patients. Further research is needed on the validity of using CAT surveys with arthroplasty patients before adoption by registries and comparison with previously reported outcomes.
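The item-selection step of a CAT survey can be sketched as follows. This is a deliberately simplified illustration and not the PROMIS algorithm: it assumes a two-parameter logistic model with yes/no items, whereas PROMIS items have several response categories, and the item bank, parameter values, and function names are all hypothetical. The next question is the unanswered item with the greatest Fisher information at the current estimate of the respondent's level.

```python
import math


def response_probability(theta, discrimination, difficulty):
    """Two-parameter logistic model: probability of endorsing the item."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def item_information(theta, discrimination, difficulty):
    """Fisher information of one item at level theta."""
    p = response_probability(theta, discrimination, difficulty)
    return discrimination ** 2 * p * (1.0 - p)


def next_item(theta, bank, asked):
    """Return the name of the most informative item not yet asked."""
    candidates = [name for name in bank if name not in asked]
    return max(candidates, key=lambda name: item_information(theta, *bank[name]))


# Hypothetical item bank: name -> (discrimination, difficulty).
bank = {
    "walk_indoors": (1.0, -2.0),
    "climb_stairs": (1.0, 0.0),
    "run_short_distance": (1.0, 2.0),
}

first = next_item(theta=0.0, bank=bank, asked=set())
second = next_item(theta=1.5, bank=bank, asked={first})
print(first, second)
```

A respondent whose estimated level rises after the first answer is routed to a more demanding question, which is how a CAT survey limits the number of items while avoiding floor and ceiling effects.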

International comparisons

Harmonization of basic elements of PROMs collection and reporting in arthroplasty registries worldwide will enable international comparisons. There are international differences in culture, provision of healthcare, response patterns, priorities, and needs. International comparisons have the potential to identify successful practices. Although it is not feasible to suggest a universal standard for a complete set of relevant measures, agreement about a minimum set of PROMs and case-mix variables will facilitate international comparisons. Comprehensive measures may be compared by established crosswalk algorithms between different instruments.

Expansion of measures into the full continuum of care

Arthroplasty is a subset of care available for degenerative hip or knee conditions. It is desirable to monitor pain, function, and general health status for the continuum of care of degenerative joint disease before and after joint replacement surgery. Broader monitoring may improve our understanding about timing of surgery, arthroplasty indication, trajectories of patients who are not candidates for joint replacement, and factors associated with successful disease management.

The ISAR PROMs Working Group thanks Adrian Sayers for valuable contributions to this report. We also thank Elly Trepman for help in reorganizing the original internal report of the ISAR PROMs Working Group into 2 manuscripts.

All the authors participated in the conception of the study. OR, ALW, EB, PF, SL, and GD drafted the manuscript. All the authors were involved in critical revision of the article.

  • Ayers D C, Li W, Oatis C, Rosal M C, Franklin P D. Patient-reported outcomes after total knee replacement vary on the basis of preoperative coexisting disease in the lumbar spine and other nonoperatively treated joints: the need for a musculoskeletal comorbidity index. J Bone Joint Surg Am 2013; 95(20): 1833–7.
  • Baker P N, van der Meulen J H, Lewsey J, Gregg P J, National Joint Registry for England and Wales. The role of pain and function in determining patient satisfaction after total knee replacement. Data from the National Joint Registry for England and Wales. J Bone Joint Surg Br 2007; 89(7): 893–900.
  • Baker P N, Rushton S, Jameson S S, Reed M, Gregg P, Deehan D J. Patient satisfaction with total knee replacement cannot be predicted from pre-operative variables alone: a cohort study from the National Joint Registry for England and Wales. Bone Joint J 2013; 95-B(10): 1359–65.
  • Barton G R, Sach T H, Avery A J, Doherty M, Jenkinson C, Muir K R. Comparing the performance of the EQ-5D and SF-6D when measuring the benefits of alleviating knee pain. Cost Eff Resour Alloc 2009; 7: 12. doi: 10.1186/1478-7547-7-12.
  • Bellamy N, Buchanan W W, Goldsmith C H, Campbell J, Stitt L W. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988; 15(12): 1833–40.
  • Bijlsma J W, Berenbaum F, Lafeber F P. Osteoarthritis: an update with relevance for clinical practice. Lancet 2011; 377(9783): 2115–26.
  • Bjorgul K, Novicoff W M, Saleh K J. Evaluating comorbidities in total hip and knee arthroplasty: available instruments. J Orthop Traumatol 2010; 11(4): 203–9.
  • Boston University School of Public Health. VR-36, VR-12 and VR-6D. Boston University School of Public Health Web site. http://www.bu.edu/sph/research/research-landing-page/vr-36-vr-12-and-vr-6d/. 2015.
  • Bourne R B, Chesworth B M, Davis A M, Mahomed N N, Charron K D. Patient satisfaction after total knee arthroplasty: who is satisfied and who is not? Clin Orthop Relat Res 2010; 468(1): 57–63.
  • Brander V, Gondek S, Martin E, Stulberg S D. Pain and depression influence outcome 5 years after knee replacement surgery. Clin Orthop Relat Res 2007; 464: 21–6.
  • Breivik E K, Björnsson G A, Skovlund E. A comparison of pain rating scales by sampling from clinical trial data. Clin J Pain 2000; 16(1): 22–8.
  • Browne J, Jamieson L, Lewsey J, van der Meulen J, Copley L, Black N. Case-mix & patients’ reports of outcome in Independent Sector Treatment Centres: comparison with NHS providers. BMC Health Serv Res 2008; 8: 78.
  • Browne J P, Bastaki H, Dawson J. What is the optimal time point to assess patient-reported recovery after hip and knee replacement? A systematic review and analysis of routinely reported outcome data from the English patient-reported outcome measures programme. Health Qual Life Outcomes 2013; 11: 128.
  • Clement N D, Macdonald D, Burnett R. Predicting patient satisfaction using the Oxford knee score: where do we draw the line? Arch Orthop Trauma Surg 2013; 133(5): 689–94.
  • Davis A M, Perruccio A V, Canizares M, Tennant A, Hawker G A, Conaghan P G, Roos E M, Jordan J M, Maillefert J F, Dougados M, Lohmander L S. The development of a short measure of physical function for hip OA HOOS-Physical Function Shortform (HOOS-PS): an OARSI/OMERACT initiative. Osteoarthritis Cartilage 2008; 16(5): 551–9.
  • Dawson J, Fitzpatrick R, Carr A, Murray D. Questionnaire on the perceptions of patients about total hip replacement. J Bone Joint Surg Br 1996a; 78(2): 185–90.
  • Dawson J, Fitzpatrick R, Murray D, Carr A. Comparison of measures to assess outcomes in total hip replacement surgery. Qual Health Care 1996b; 5(2): 81–8.
  • Dawson J, Fitzpatrick R, Murray D, Carr A. Questionnaire on the perceptions of patients about total knee replacement. J Bone Joint Surg Br 1998; 80(1): 63–9.
  • Dawson J, Rogers K, Doll H, Fitzpatrick R, Cooper C, Carr A. Using patient-reported outcome measures (PROMs) routinely: an example in the context of elective shoulder surgery. Open Epidemiology Journal 2010; 3: 42–53.
  • Devane P, Horne G, Gehling D J. Oxford hip scores at 6 months and 5 years are associated with total hip revision within the subsequent 2 years. Clin Orthop Relat Res 2013; 471(12): 3870–4.
  • Dieppe P A, Lohmander L S. Pathogenesis and management of pain in osteoarthritis. Lancet 2005; 365(9463): 965–73.
  • Dieppe P, Lim K, Lohmander S. Who should have knee joint replacement surgery for osteoarthritis? Int J Rheum Dis 2011; 14(2): 175–80.
  • Dunbar M J, Robertsson O, Ryd L. What’s all that noise? The effect of co-morbidity on health outcome questionnaire results after knee arthroplasty. Acta Orthop Scand 2004; 75(2): 119–26.
  • Edwards P, Roberts I, Clarke M, DiGuiseppi C, Pratap S, Wentz R, Kwan I. Increasing response rates to postal questionnaires: systematic review. BMJ 2002; 324(7347): 1183.
  • Edwards P, Roberts I, Clarke M, DiGuiseppi C, Pratap S, Wentz R, Kwan I, Cooper R. Methods to increase response rates to postal questionnaires. Cochrane Database Syst Rev 2007; 2: MR000008.
  • EuroQol Group. EQ-5D. EuroQol Group Web site. http://www.euroqol.org. 2015.
  • EuroQol Group. EuroQol – a new facility for the measurement of health-related quality of life. Health Policy 1990; 16(3): 199–208.
  • Franklin P D, Li W, Ayers D C. The Chitranjan Ranawat Award. Functional outcome after total knee replacement varies with patient attributes. Clin Orthop Relat Res 2008; 466(11): 2597–604.
  • Garellick G, Kärrholm J, Rogmark C, Herberts P. Swedish Hip Arthroplasty Register, annual report 2008, shortened version. Gothenburg, Sweden: Swedish Hip Arthroplasty Register, Department of Orthopaedics, Sahlgrenska University Hospital, 2009.
  • Garellick G, Kärrholm J, Rogmark C, Herberts P. Swedish Hip Arthroplasty Register, annual report 2009, shortened version. Gothenburg, Sweden: Swedish Hip Arthroplasty Register, Department of Orthopaedics, Sahlgrenska University Hospital, 2010.
  • Gliklich R E, Dreyer N A, Leavy M B, editors. Registries for evaluating patient outcomes: a user’s guide, 3rd edition. Rockville, MD: Agency for Healthcare Research and Quality; 2014.
  • Gordon M, Paulsen A, Overgaard S, Garellick G, Pedersen A B, Rolfson O. Factors influencing health-related quality of life after total hip replacement – a comparison of data from the Swedish and Danish hip arthroplasty registries. BMC Musculoskelet Disord 2013; 14: 316.
  • Gordon M, Greene M, Frumento P, Rolfson O, Garellick G, Stark A. Age- and health-related quality of life after total hip replacement: decreasing gains in patients above 70 years of age. Acta Orthop 2014a; 85(3): 244–9.
  • Gordon M, Frumento P, Sköldenberg O, Greene M, Garellick G, Rolfson O. Women in Charnley class C fail to improve in mobility to a higher degree after total hip replacement. Acta Orthop 2014b; 85(4): 335–41.
  • Graham J W. Missing data analysis: making it work in the real world. Annu Rev Psychol 2009; 60: 549–76.
  • Greene M E. Who should have total hip replacement? Use of patient-reported outcome measures in identifying the indications for and assessment of total hip replacement [thesis]. Gothenburg: University of Gothenburg; http://hdl.handle.net/2077/38458, 2015.
  • Greene M E, Rolfson O, Nemes S, Gordon M, Malchau H, Garellick G. Education attainment is associated with patient-reported outcomes: findings from the Swedish Hip Arthroplasty Register. Clin Orthop Relat Res 2014a; 472(6): 1868–76.
  • Greene M E, Rader K A, Garellick G, Malchau H, Freiberg A A, Rolfson O. The EQ-5D-5L improves on the EQ-5D-3L for health-related quality-of-life assessment in patients undergoing total hip arthroplasty. Clin Orthop Relat Res 2014b Dec 9 [Epub ahead of print].
  • Greene M E, Rolfson O, Gordon M, Garellick G, Nemes S. Standard comorbidity measures do not predict patient-reported outcomes 1 year after total hip arthroplasty. Clin Orthop Relat Res 2015 Feb 21 [Epub ahead of print].
  • Hamilton D F, Lane J V, Gaston P, Patton J T, Macdonald D, Simpson A H, Howie C R. What determines patient satisfaction with surgery? A prospective cohort study of 4709 patients following total joint replacement. BMJ Open 2013; 3(4).
  • Hawker G A, Mian S, Kendzerska T, French M. Measures of adult pain: Visual Analog Scale for Pain (VAS Pain), Numeric Rating Scale for Pain (NRS Pain), McGill Pain Questionnaire (MPQ), Short-Form McGill Pain Questionnaire (SF-MPQ), Chronic Pain Grade Scale (CPGS), Short Form-36 Bodily Pain Scale (SF-36 BPS), and Measure of Intermittent and Constant Osteoarthritis Pain (ICOAP). Arthritis Care Res (Hoboken) 2011; 63 (suppl 11): S240–S52.
  • Herdman M, Gudex C, Lloyd A, Janssen M, Kind P, Parkin D, Bonsel G, Badia X. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res 2011; 20(10): 1727–36.
  • Hernán M A, Hernández-Díaz S, Werler M M, Mitchell A A. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 2002; 155(2): 176–84.
  • Hjermstad M J, Fayers P M, Haugen D F, Caraceni A, Hanks G W, Loge J H, Fainsinger R, Aass N, Kaasa S, European Palliative Care Research Collaborative (EPCRC). Studies comparing Numerical Rating Scales, Verbal Rating Scales, and Visual Analogue Scales for assessment of pain intensity in adults: a systematic literature review. J Pain Symptom Manage 2011; 41(6): 1073–93.
  • Hooper G J, Rothwell A G, Hooper N M, Frampton C. The relationship between the American Society of Anesthesiologists physical rating and outcome following total hip and knee arthroplasty: an analysis of the New Zealand Joint Registry. J Bone Joint Surg Am 2012; 94(12): 1065–70.
  • International Consortium for Health Outcomes Measurement. Hip and Knee Osteoarthritis Standard Set. http://www.ichom.org. 2015.
  • International Society of Arthroplasty Registries. Bylaws (revised March 2013). International Society of Arthroplasty Registries Web site. http://www.isarhome.org/statements. 2015.
  • Jameson S S, Mason J, Baker P, Gregg P J, McMurtry I A, Deehan D J, Reed M R. A comparison of surgical approaches for primary hip arthroplasty: a cohort study of patient reported outcome measures (PROMs) and early revision using linked national databases. J Arthroplasty 2014a; 29(6): 1248–55 e1.
  • Jameson S S, Mason J M, Baker P N, Elson D W, Deehan D J, Reed M R. The impact of body mass index on patient reported outcome measures (PROMs) and complications following primary hip arthroplasty. J Arthroplasty 2014b; 29(10): 1889–98.
  • Jenkins P J, Duckworth A D, Robertson F P, Howie C R, Huntley J S. Profiles of biomarkers of excess alcohol consumption in patients undergoing total hip replacement: correlation with function. ScientificWorldJournal 2011; 11: 1804–11.
  • Jones C A, Voaklander D C, Johnston D W, Suarez-Almazor M E. The effect of age on pain, function, and quality of life after total hip and knee arthroplasty. Arch Intern Med 2001; 161(3): 454–60.
  • Journal of the American Medical Association. JAMA Instructions For Authors. JAMA Network Web site. http://jama.jamanetwork.com/public/instructionsForAuthors.aspx. 2015.
  • Judge A, Cooper C, Williams S, Dreinhoefer K, Dieppe P. Patient-reported outcomes one year after primary hip replacement in a European Collaborative Cohort. Arthritis Care Res (Hoboken) 2010; 62(4): 480–8.
  • Judge A, Arden N K, Price A, Glyn-Jones S, Beard D, Carr A J, Dawson J, Fitzpatrick R, Field R E. Assessing patients for joint replacement: can pre-operative Oxford hip and knee scores be used to predict patient satisfaction following joint replacement surgery and to guide patient selection? J Bone Joint Surg Br 2011; 93(12): 1660–4.
  • Judge A, Arden N K, Kiran A, Price A, Javaid M K, Beard D, Murray D, Field R E. Interpretation of patient-reported outcomes for hip and knee replacement surgery: identification of thresholds associated with satisfaction with surgery. J Bone Joint Surg Br 2012a; 94(3): 412–8.
  • Judge A, Arden N K, Cooper C, Kassim Javaid M, Carr A J, Field R E, Dieppe P A. Predictors of outcomes of total knee replacement surgery. Rheumatology (Oxford) 2012b; 51(10): 1804–13.
  • Judge A, Batra R N, Thomas G E, Beard D, Javaid M K, Murray D W, Dieppe P A, Dreinhoefer K E, Peter-Guenther K, Field R, Cooper C, Arden N K. Body mass index is not a clinically meaningful predictor of patient reported outcomes of primary hip replacement surgery: prospective cohort study. Osteoarthritis Cartilage 2014; 22(3): 431–9.
  • Kay A, Davison B, Badley E, Wagstaff S. Hip arthroplasty: patient satisfaction. Br J Rheumatol 1983; 22(4): 243–9.
  • Kennedy D M, Stratford P W, Wessel J, Gollish J D, Penney D. Assessing stability and change of four performance measures: a longitudinal study evaluating outcome following total hip and knee arthroplasty. BMC Musculoskelet Disord 2005; 6: 3.
  • King M T. A point of minimal important difference (MID): a critique of terminology and methods. Expert Rev Pharmacoecon Outcomes Res 2011; 11(2): 171–84.
  • Kosinski M, Keller S D, Hatoum H T, Kong S X, Ware J E Jr. The SF-36 Health Survey as a generic outcome measure in clinical trials of patients with osteoarthritis and rheumatoid arthritis: tests of data quality, scaling assumptions and score reliability. Med Care 1999; 37(5 suppl): MS10–22.
  • Kovacs F M, Abraira V, Royuela A, Corcoll J, Alegre L, Tomás M, Mir M A, Cano A, Muriel A, Zamora J, Del Real M T, Gestoso M, Mufraggi N, Spanish Back Pain Research Network. Minimum detectable and minimal clinically important changes for pain in patients with nonspecific neck pain. BMC Musculoskelet Disord 2008; 9: 43.
  • Le Q A. Probabilistic mapping of the health status measure SF-12 onto the health utility measure EQ-5D using the US-population-based scoring models. Qual Life Res 2014; 23(2): 459–66.
  • Liao C Y, Chan H T, Chao E, Yang C M, Lu T C. Comparison of total hip and knee joint replacement in patients with rheumatoid arthritis and osteoarthritis: a nationwide, population-based study. Singapore Med J 2015; 56(1): 58–64.
  • Lindgren J V, Wretenberg P, Kärrholm J, Garellick G, Rolfson O. Patient-reported outcome is influenced by surgical approach in total hip replacement: a study of the Swedish Hip Arthroplasty Register including 42,233 patients. Bone Joint J 2014; 96-B(5): 590–6.
  • Little R J A, Rubin D B. Statistical analysis with missing data, 2nd ed. Hoboken, NJ: John Wiley & Sons; 2002.
  • Lübbeke A, Stern R, Garavaglia G, Zurcher L, Hoffmeyer P. Differences in outcomes of obese women and men undergoing primary total hip arthroplasty. Arthritis Rheum 2007; 57(2): 327–34.
  • Lübbeke A, Gonzalez A, Garavaglia G, Roussos C, Bonvin A, Stern R, Peter R, Hoffmeyer P. A comparative assessment of small-head metal-on-metal and ceramic-on-polyethylene total hip replacement. Bone Joint J 2014; 96-B(7): 868–75.
  • Lurie J D, Weinstein J N. Shared decision-making and the orthopaedic workforce. Clin Orthop Relat Res 2001; 385: 68–75.
  • Mahomed N, Gandhi R, Daltroy L, Katz J N. The self-administered patient satisfaction scale for primary hip and knee arthroplasty. Arthritis 2011; 2011: 591253.
  • Makoul G, Clayman M L. An integrative model of shared decision making in medical encounters. Patient Educ Couns 2006; 60(3): 301–12.
  • Mancuso C A, Salvati E A, Johanson N A, Peterson M G, Charlson M E. Patients’ expectations and satisfaction with total hip arthroplasty. J Arthroplasty 1997; 12(4): 387–96.
  • Mancuso C A, Sculco T P, Wickiewicz T L, Jones E C, Robbins L, Warren R F, Williams-Russo P. Patients’ expectations of knee surgery. J Bone Joint Surg Am 2001; 83-A(7): 1005–12.
  • Maratt J D, Lee Y Y, Lyman S, Westrich G H. Predictors of satisfaction following total knee arthroplasty. J Arthroplasty 2015; 30(7): 1142–5.
  • MarketingCharts. Generational Differences in Consumers’ Screen Preferences. MarketingCharts Web site. http://www.marketingcharts.com/traditional/generational-differences-in-consumers-screen-preferences51119/?utm_campaign=rssfeed&utm_source=mc&utm_medium=textlink. 2015.
  • McConnell S, Kolopack P, Davis A M. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC): a review of its utility and measurement properties. Arthritis Rheum 2001; 45(5): 453–61.
  • Merle-Vincent F, Couris C M, Schott A M, Conrozier T, Piperno M, Mathieu P, Vignon E, Osteoarthritis Section of the French Society for Rheumatology. Factors predicting patient satisfaction 2 years after total knee arthroplasty for osteoarthritis. Joint Bone Spine 2011; 78(4): 383–6.
  • Mourão A F, Amaral M, Caetano-Lopes J, Isenberg D. An analysis of joint replacement in patients with systemic lupus erythematosus. Lupus 2009; 18(14): 1298–302.
  • Murray D W, Fitzpatrick R, Rogers K, Pandit H, Beard D J, Carr A J, Dawson J. The use of the Oxford hip and knee scores. J Bone Joint Surg Br 2007; 89(8): 1010–4.
  • Neuburger J, Hutchings A, Allwood D, Black N, van der Meulen J H. Sociodemographic differences in the severity and duration of disease amongst patients undergoing hip or knee replacement surgery. J Public Health (Oxf) 2012; 34(3): 421–9.
  • Neuburger J, Hutchings A, Black N, van der Meulen J H. Socioeconomic differences in patient-reported outcomes after a hip or knee replacement in the English National Health Service. J Public Health (Oxf) 2013; 35(1): 115–24.
  • NHS England Analytical Team. Patient reported outcome measures: update to reporting and case-mix adjusting hip and knee procedure data. London: National Health Service England; 2013.
  • Nicholl J. Case-mix adjustment in non-randomised observational evaluations: the constant risk fallacy. J Epidemiol Community Health 2007; 61(11): 1010–3.
  • Nilsdotter A K, Lohmander L S, Klässbo M, Roos E M. Hip disability and osteoarthritis outcome score (HOOS) – validity and responsiveness in total hip replacement. BMC Musculoskelet Disord 2003; 4: 10.
  • Noble P C, Conditt M A, Cook K F, Mathis K B. The John Insall Award: Patient expectations affect satisfaction with total knee arthroplasty. Clin Orthop Relat Res 2006; 452: 35–43.
  • Obradovic M, Lal A, Liedgens H. Validity and responsiveness of EuroQol-5 dimension (EQ-5D) versus Short Form-6 dimension (SF-6D) questionnaire in chronic pain. Health Qual Life Outcomes 2013; 11: 110.
  • Optum Inc. SF-36.org: a community for measuring health outcomes using SF tools. Optum Inc. Web site. http://www.sf-36.org. 2015.
  • Organisation for Economic Co-operation and Development (OECD). Classifying educational programmes: manual for ISCED-97 implementation in OECD Countries. Paris: OECD; 1999.
  • Patient-Reported Outcomes Measurement Information System. PROMIS Overview. PROMIS Web site. http://www.nihpromis.org/about/overview. 2015.
  • Perruccio A V, Stefan Lohmander L, Canizares M, Tennant A, Hawker G A, Conaghan P G, Roos E M, Jordan J M, Maillefert J F, Dougados M, Davis A M. The development of a short measure of physical function for knee OA KOOS-Physical Function Shortform (KOOS-PS) – an OARSI/OMERACT initiative. Osteoarthritis Cartilage 2008; 16(5): 542–50.
  • Riddle D L, Wade J B, Jiranek W A, Kong X. Preoperative pain catastrophizing predicts pain outcome after knee arthroplasty. Clin Orthop Relat Res 2010; 468(3): 798–806.
  • Robertsson O, Dunbar M, Pehrsson T, Knutson K, Lidgren L. Patient satisfaction after knee arthroplasty: a report on 27,372 knees operated on between 1981 and 1995 in Sweden. Acta Orthop Scand 2000; 71(3): 262–7.
  • Robertsson O, Dunbar M J. Patient satisfaction compared with general health and disease-specific questionnaires in knee arthroplasty patients. J Arthroplasty 2001; 16(4): 476–82.
  • Rolfson O, Dahlberg L E, Nilsson J A, Malchau H, Garellick G. Variables determining outcome in total hip replacement surgery. J Bone Joint Surg Br 2009; 91(2): 157–61.
  • Rolfson O, Kärrholm J, Dahlberg L E, Garellick G. Patient-reported outcomes in the Swedish Hip Arthroplasty Register: results of a nationwide prospective observational study. J Bone Joint Surg Br 2011a; 93(7): 867–75.
  • Rolfson O, Rothwell A, Sedrakyan A, Chenok K E, Bohm E, Bozic K J, Garellick G. Use of patient-reported outcomes in the context of different levels of data. J Bone Joint Surg Am 2011b; 93 (suppl 3): 66–71.
  • Rolfson O, Salomonsson R, Dahlberg L E, Garellick G. Internet-based follow-up questionnaire for measuring patient-reported outcome after total hip replacement surgery – reliability and response rate. Value Health 2011c; 14(2): 316–21.
  • Rolfson O, Chenok K E, Bohm E, Lübbeke A, Denissen G, Dunn J, Lyman S, Franklin P, Dunbar M, Overgaard S, Garellick G, Dawson J; the Patient-Reported Outcome Measures Working Group of the International Society of Arthroplasty Registries. Patient-reported outcome measures in arthroplasty registries. Report of the Patient-Reported Outcome Measures Working Group of the International Society of Arthroplasty Registries. Part I. Overview and rationale for patient-reported outcome measures. Acta Orthop 2016; (Suppl 362):
  • Roos E M, Roos H P, Lohmander L S, Ekdahl C, Beynnon B D. Knee Injury and Osteoarthritis Outcome Score (KOOS) – development of a self-administered outcome measure. J Orthop Sports Phys Ther 1998; 28(2): 88–96.
  • Schrama J C, Fenstad A M, Dale H, Havelin L, Hallan G, Overgaard S, Pedersen A B, Kärrholm J, Garellick G, Pulkkinen P, Eskelinen A, Mäkelä K, Engesæter L B, Fevang B T. Increased risk of revision for infection in rheumatoid arthritis patients with total hip replacements. Acta Orthop 2015; 86(4): 469–76.
  • SooHoo N F, Li Z, Chenok K E, Bozic K J. Responsiveness of patient reported outcome measures in total joint arthroplasty patients. J Arthroplasty 2015; 30(2): 176–91.
  • United Nations Educational, Scientific and Cultural Organization (UNESCO). International Standard Classification of Education: ISCED 1997. Paris: UNESCO; 2006.
  • Vogl M, Wilkesmann R, Lausmann C, Hunger M, Plötz W. The impact of preoperative patient characteristics on health states after total hip replacement and related satisfaction thresholds: a cohort study. Health Qual Life Outcomes 2014; 12: 108.
  • Ware J E Jr, Sherbourne C D. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care 1992; 30(6): 473–83.
  • Ware J Jr, Kosinski M, Keller S D. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care 1996; 34(3): 220–33.
  • Weinstein J N, Clay K, Morgan T S. Informed patient choice: patient-centered valuing of surgical risks and benefits. Health Aff (Millwood) 2007; 26(3): 726–30.
  • Wewers M E, Lowe N K. A critical review of visual analogue scales in the measurement of clinical phenomena. Res Nurs Health 1990; 13(4): 227–36.
  • Whitehouse S L, Lingard E A, Katz J N, Learmonth I D. Development and testing of a reduced WOMAC function scale. J Bone Joint Surg Br 2003; 85(5): 706–11.
  • Williams D P, Price A J, Beard D J, Hadfield S G, Arden N K, Murray D W, Field R E. The effects of age on patient-reported outcome measures in total knee replacements. Bone Joint J 2013; 95-B(1): 38–44.