
Evaluation of an assessment scale for aesthetic outcome in breast reconstructions based on digital photos in both 2D and 3D format

Pages 427-433 | Received 06 Apr 2022, Accepted 24 Nov 2022, Published online: 12 Dec 2022

Abstract

The aesthetic outcome is crucial in a breast reconstruction. Our aim was to evaluate the intra- and interrater reliability of an aesthetic outcome assessment scale with digital photos of breast reconstructions in two-dimensional (2D) and three-dimensional (3D) format. Thirty-three women with delayed breast reconstructions, consecutively participating in a five-year follow-up between November 2019 and June 2021, were included in the study. Of these, 14 were reconstructed with an expander prosthesis (EP) and 19 with a deep inferior epigastric perforator (DIEP) flap. Photos of the breasts were assessed in 2D and 3D format by expert, layman and patient panels. Data were analysed with the weighted kappa (wκ) statistic. The intrarater agreements were moderate to substantial, with median wκ between 0.66 and 0.73 for the panels. Within the panels, the interrater agreements were 0.46–0.62. Moderate agreements were found between the matched 2D and 3D format photos (wκ 0.62–0.66). The patient panel graded scar appearance worse in 3D compared with 2D format. In all panels, there was a tendency towards DIEP flap reconstructions receiving higher aesthetic outcome grades compared with EP. Thus, the aesthetic outcome assessment scale demonstrated acceptable agreements between the individual panellists and within the panels. Scars captured in 3D format may provide a closer resemblance to reality compared with 2D. Implications for clinics remain to be further studied.

Background

In Sweden in 2020, 2405 women with breast cancer and no remote metastases underwent mastectomy [Citation1]. Mastectomy has a negative impact on women’s body image and quality of life and thus, a breast reconstruction is offered to mitigate these effects [Citation2]. A breast reconstruction may be implant-based, created from autologous tissue, or potentially a combination of the two. Individual patient characteristics and patient preferences will guide the choice of breast reconstruction method and influence the result. A satisfactory aesthetic result together with a good functional outcome are essential in a breast reconstruction. Yet there is no agreement on how best to evaluate the aesthetic outcome.

The aesthetic outcome after a breast reconstruction is often evaluated with photos using an assessment scale. A variety of assessment scales have been reported [Citation3–9]. The most common assessment scale used for professional assessment has been a four-point scale [Citation10]. In previous reports, the number and size of panels recruited to evaluate the aesthetic outcome have differed and the agreements have in many cases been poor or have not been addressed [Citation5–7,Citation11]. In addition, measurement of aesthetic outcome with panel assessment has received criticism as it is time-consuming [Citation12]. However, subjectivity is crucial for evaluating outcomes in plastic surgery as it may provide information that is not explored by objective measures. To this day, there is no gold standard assessment scale for evaluation of breast reconstructions. In a review article from 2015, the scale reported by Visser et al. was considered the most preferable as it demonstrated high validity [Citation4,Citation10]. It was, however, limited by a wide range of intra- and interrater agreements [Citation4]. An assessment scale that is reliable between assessors with similar experiences and can identify differences over time is desirable. Therefore, the main aim of this study was to evaluate the reliability of an assessment scale for aesthetic outcome in breast reconstructions. A secondary aim was to compare the aesthetic outcome following expander prosthesis (EP) breast reconstructions with deep inferior epigastric perforator (DIEP) flap breast reconstructions.

Material and methods

Patients

Thirty-four consecutive patients who had undergone unilateral delayed breast reconstruction between October 2012 and November 2016 were selected for participation in this study. The patients had been randomised to breast reconstruction with either an EP or a DIEP flap, and participated in a prospective five-year follow-up [Citation13]. The study was approved by the Ethical Review Board in Sweden (Dnr 2012/187 and Dnr 2021-00555).

Photo session

Photography of the patients was performed by a professional medical photographer or by the first author, in a hospital photo studio with standardised lighting. A two-dimensional (2D) camera and a three-dimensional (3D) camera were used for documentation of the breasts. The photos in 2D format were taken with a single-lens reflex digital camera (Nikon Corporation, Tokyo, Japan). The lens used during photography was a NIKKOR lens with constant f/2.8 aperture and focal length of 24–70 mm (Nikon Corporation, Tokyo, Japan). The 3D camera system, 3dMD trio system (3dMD LLC, Atlanta, GA), had 12 fixed cameras. Of these 12 cameras, four were mounted frontally and four on each side. Prior to each 3D photo session, the 3dMD trio system was calibrated. Subsequently, photos were taken from three angles, resulting in an image that could be viewed in 3D in the 3dMD Vultus (version 2.2.024) program. The program enables the viewer to rotate the 3D photo, zoom in on details and conduct measurements, but was not used by the panellists in this study.

Aesthetic outcome assessment scale

The assessment scale used in this study was a modification of the scale reported by Visser et al. [Citation4]. The five items (breast size, shape, symmetry, scar appearance and nipple areolar complex, NAC) were graded using a five-point Likert scale. The five-point Likert scale ranged from very bad (1) to very good (5). Unlike in the scale by Visser et al., the item size was graded from much smaller (1), through equal (3), to much larger (5), compared with the non-reconstructed breast. The option “Cannot be evaluated” was added as a modification in the absence of a NAC. The overall aesthetic outcome was assessed using a ten-point Likert scale, from very bad (1) to very good (10).

Panels

Three types of panellists were recruited for this study. Plastic surgeons and breast surgeons participated as experts. Only consultant and senior consultant physicians were invited. Laymen with varying degrees of medical knowledge were invited to join the layman panels. Twelve patients were invited to join a patient panel. Their participation included assessment of their own breast reconstruction.

Data collection

The study data were managed and collected using Research Electronic Data Capture (REDCap) tools hosted at Lund University [Citation14,Citation15]. REDCap is a web-based platform which we used to facilitate the photo assessments. The study was performed in two phases. In the first phase, 48 sets of photos accompanied by the assessment scale were included. The same breast reconstruction appeared on two sets of photos. There were four photos per set in 2D format and five per set in 3D format (Figure 1). The sets were arranged in a randomised order. Laterality was noted but not reconstruction type. To facilitate a high response rate, the assessments could be completed at any time. The panellists were not informed in advance that the same reconstruction appeared twice, nor that two different camera modalities were used. All panellists were asked to perform the assessment twice, a minimum of three weeks apart. They were also asked to record the time it took to perform the assessment. An expert and a layman panel assessed the photos in the study’s first phase and a reliability analysis was conducted. In the second phase of the study, all breast reconstructions apart from two were replaced by new breast reconstructions. Twelve breast reconstructions were included, and in total there were 24 sets of photos. An expert, a layman and a patient panel assessed the breast reconstructions in the second phase. The assessment was performed twice by the patients and once by the other panels.

Figure 1. Examples of postoperative photographs evaluated in the study. (A) An expander prosthesis (EP) breast reconstruction in two-dimensional (2D) format and in (B) three-dimensional (3D) format. (C) A deep inferior epigastric perforator (DIEP) flap breast reconstruction in 2D format and in (D) 3D format.

Statistical analysis

The Statistical Package for the Social Sciences (SPSS) version 27 (IBM Corp., Armonk, NY; released 2020) was used for statistical analysis. Intrarater and interrater agreements were calculated with the weighted kappa (wκ) and were presented as median, minimum and maximum values. The wκ was also used for a reliability analysis comparing the assessments of digital photos in 2D format with the corresponding assessments in 3D format. Interrater reliability was presented as the median of the kappa values calculated individually for each pair of panellists. Level of agreement was interpreted as poor below 0.00, slight 0.00–0.20, fair 0.21–0.40, moderate 0.41–0.60, substantial 0.61–0.80, and almost perfect 0.81–1.00 [Citation16]. A p-value below 0.05 was considered statistically significant.
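As an illustration of the calculations described above, the following Python sketch computes a weighted kappa for two series of Likert grades, maps a value to the agreement levels listed above, and summarises the pairwise values within a panel as median, minimum and maximum. It is not the SPSS procedure used in the study. The choice of linear weights is an assumption, since the weighting scheme is not stated here, and the function names, the handling of values that fall between two listed bands, and the example grades are all hypothetical.

```python
from itertools import combinations
from statistics import median


def weighted_kappa(a, b, n_categories=5):
    """Cohen's weighted kappa for two lists of grades in 1..n_categories.

    Linear weights are ASSUMED here; the weighting scheme is not stated in the text.
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    k, n = n_categories, len(a)
    if any(not 1 <= g <= k for g in list(a) + list(b)):
        raise ValueError("grades must lie between 1 and n_categories")
    # Cross-tabulate the two series of grades (counts, not proportions).
    observed = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x - 1][y - 1] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Disagreement weighted by the distance between grades, observed vs. expected by chance.
    disagreement = sum(abs(i - j) * observed[i][j] for i in range(k) for j in range(k))
    expected = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical grade throughout")
    return 1 - disagreement * n / expected


def interpret_kappa(value):
    """Agreement level as listed in the text; values between bands go to the higher band (assumption)."""
    if value < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect"


def pairwise_summary(ratings_by_rater, n_categories=5):
    """Median, minimum and maximum of the kappa values over all pairs of raters."""
    values = [
        weighted_kappa(ratings_by_rater[r1], ratings_by_rater[r2], n_categories)
        for r1, r2 in combinations(sorted(ratings_by_rater), 2)
    ]
    return median(values), min(values), max(values)


# Hypothetical grades from three raters for six photo sets (not study data).
panel = {
    "rater_a": [4, 5, 3, 2, 4, 5],
    "rater_b": [4, 4, 3, 2, 5, 5],
    "rater_c": [3, 5, 3, 1, 4, 4],
}
med, low, high = pairwise_summary(panel)
print(f"median weighted kappa = {med:.2f} ({low:.2f} to {high:.2f}), {interpret_kappa(med)}")
```

The same `weighted_kappa` function could be applied to a single panellist's first and second assessments for an intrarater value, or to matched 2D and 3D assessments, with `n_categories=10` for the overall aesthetic outcome.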

Results

Patient characteristics

Thirty-four patients completed the photography as a part of the prospective follow-up. One patient was excluded as she had undergone a contralateral breast reconstruction due to breast cancer. Thus, 33 patients were included. The photo sessions were performed between November 2019 and June 2021 at a mean of 66 (standard deviation, SD 11) months after breast reconstruction. The median age at breast reconstruction was 55 (SD 10) years. Of the included patients, 14 breasts were reconstructed with an EP and 19 with a DIEP flap.

Panel characteristics

Eleven plastic surgeons and two breast surgeons participated in the expert panel in the first phase of the study. Of these, eight were men and five were women. The age within the panel ranged from 38 to 68 years. Eleven of them performed the assessment twice. In the second phase, none of the four expert panellists had been involved in the care of the patients. The layman panel comprised nine panellists, of whom four were men and five were women. Their ages ranged from 20 to 58 years. One member of the panel was a senior consultant physician working within a non-surgical specialty, two were intern physicians, and one was a medical student. The other laymen did not have any previous medical knowledge.

The median time for performing the assessment in the first phase was 60 (40–120) min for the experts and 50 (35–120) min for the laymen.

Reliability analysis

Distribution on the Likert scale

The distributions of the panels’ gradings per item are shown in Tables 1 and 2. The expert and layman assessments are from the first phase of the study and the patient panel assessment from the second phase. Grades 4 and 5 were the most frequent for symmetry; however, grade 2 was the most common grade in the layman panel assessment of photos in 3D format. Regarding scar appearance, grade 4 was the most frequent grade in the expert and layman panels. However, photos in 2D format were most frequently assessed as grade 5 and photos in 3D format as grade 2 by the patient panel.

Table 1. Frequency distribution of panels’ grading from the panels’ first assessments separated by photo format for size, shape, symmetry, scar appearance and nipple areolar complex.

Table 2. Frequency distribution of panels’ grading from the panels’ first assessments separated by photo format for the overall aesthetic outcome.

Reliability for repeated assessments

In the expert panel, the intrarater agreements were moderate to substantial with a median wκ of 0.70 (0.62–0.75) for photos in 2D format and 0.67 (0.54–0.80) for photos in 3D format. In the layman panel, the agreements were moderate to substantial. The median wκ was 0.70 (0.58–0.74) for photos in 2D format and 0.66 (0.60–0.69) for photos in 3D format. The patient panel, assessed in the second phase of the study, had a median wκ of 0.73 (0.50–0.89) and 0.72 (0.29–0.83) respectively. The intrarater agreements are summarised in Table 3.

Table 3. Intrarater agreements with weighted kappa (wκ) values.

Reliability for assessment within panels

The interrater agreements were moderate in the expert and the layman panels. In the expert panel (n = 13), the median wκ was 0.60 (0.36–0.74) for photos in 2D format and 0.55 (0.35–0.77) for photos in 3D format. In the layman panel (n = 9), the assessments resulted in a median wκ of 0.62 (0.44–0.73) and 0.57 (0.38–0.67) for photos in 2D and 3D format respectively. The agreements in the patient panel (n = 12) were somewhat lower with a median wκ of 0.46 (0.19–0.73) for photos in 2D format and 0.48 (0.04–0.73) for photos in 3D format. The interrater agreements are presented in Table 4.

Table 4. Interrater agreements with weighted kappa (wκ) values.

Digital photos in 2D and 3D format

Intrarater agreements were calculated on the matched assessments of photos in 2D and 3D format. The median wκ was moderate in all panels. The median wκ was 0.64 (0.33–0.78) in the expert panel, 0.62 (0.46–0.73) in the layman panel, and 0.66 (0.25–0.77) in the patient panel. Separated by reconstruction method, the median wκ values were somewhat higher in assessments of DIEP flaps. The results are presented in Table 5.

Table 5. Intrarater agreements with weighted kappa (wκ) between assessments in 2D format and the corresponding assessments in 3D format.

Aesthetic outcome

The aesthetic outcome results are presented in Table 6. The results presented are from the second phase of the study. In all panels, there was a general tendency towards higher grades for DIEP flap breast reconstructions compared with EP. The tendency was more pronounced for the overall aesthetic outcome regarding photos in 3D format. In comparison of the panels, laymen gave the lowest median grades for overall aesthetic outcome. An in-depth review of the patients receiving lower overall outcome scores, less than or equal to 6.5, by the expert panels in phases one and two illustrated potential negative factors such as previous prosthesis exchanges (n = 3), an increase in body mass index (BMI) of more than four units (n = 1) and early reoperations due to complications (n = 2).

Table 6. Aesthetic outcome scores per item and photo format assessed by three panels.

Discussion

In this study, we report on the reliability of an aesthetic outcome assessment scale used for breast reconstructions. Median agreements were moderate to substantial for repeated assessments in expert, layman and patient panels. Between members of the same panel, somewhat lower median agreements were found, with the lowest values in the patient panel. In a comparison of matched photos in 2D and 3D format, moderate to substantial median agreements were demonstrated.

In the context of breast reconstructions, repeated evaluations are essential to identify changes postoperatively. For example, weight changes and implant disfiguration may alter the aesthetic result over time. Based on the findings from this study, the assessment scale demonstrated acceptable reproducibility. Compared with a study by Veiga et al., the agreements in our report were high. They presented intrarater agreements between 0.12 and 1 for photo evaluations of autologous breast reconstructions at three different time points [Citation17]. The wide agreement range presented could be explained by the scale used. It may be difficult for panellists to distinguish between adjacent grades in the presence of a scale with ten grades. However, our findings concurred with the intrarater agreements reported by Godden et al. ranging from 0.4 to 0.7, using a five-point scale [Citation18].

The use of aesthetic outcome assessment scales has been questioned, partly as a result of the high variability of interrater agreements reported in the literature. In the past, different statistical analysis methods have been used, which complicates comparisons between studies [Citation3,Citation5,Citation19]. Moreover, in some studies, reliability was not analysed [Citation6,Citation7]. Results from this study reflect moderate agreements, similar to some previous studies [Citation5,Citation19]. Lindegren et al. and Gahm et al. used the wκ and consequently, their results can be compared with ours [Citation5,Citation19]. Meanwhile, Visser et al. and Liu et al. used a different analysis method, the intraclass correlation, and presented higher agreements [Citation3,Citation4]. We found the lowest agreements in the patient panel, indicating that there may be heterogeneity within this panel. Plausibly, the patient’s own reconstruction experience and result influence the perception of other breasts. Presumably, a satisfied patient will assess other breast reconstructions more favourably. Although we aim for high agreements, some variability is to be expected as aesthetics are perceived differently between individuals.

The agreements between the matched assessments of photos in 2D and 3D format, together with similar intrarater agreements between the two, suggest that the photo formats can be used comparably. A similar result was reported in a study evaluating cleft lip and palate patients using 2D and 3D photos. The difference between the interrater agreements of the 2D and 3D photos was small, 0.56 and 0.62, respectively [Citation20]. We used a 3D camera system aiming for more realistic and detailed photos compared with the standard digital photos in 2D format. Our hypothesis was that the panels would grade photos in 3D format worse due to greater enhancement of scars and skin surface irregularities. We did not find a general tendency confirming this hypothesis. Interestingly, compared with the photos in 2D format, the photos in 3D format were assessed with lower grades concerning scar appearance by the patient panel. This result may be explained by scar appearance being an important outcome for patients, and therefore assessed more critically. Also, patients may have unrealistic expectations concerning the final scar appearance. This is further supported by the findings in the study by Lindegren et al. in which patients were less satisfied with the DIEP flap donor scar compared with experts [Citation5]. Photos in 3D format may provide a better reflection of reality. By using photos in 3D format when informing patients preoperatively, more realistic expectations may be achieved. In the process of choosing the reconstruction method it is crucial that the patient is well-informed, with awareness of the possible aesthetic outcomes, as this may increase the postoperative satisfaction. In addition, a future perspective would be to evaluate the reliability between outpatient clinic assessments and 3D photo assessments.

Although a low number of patients were included in this study, the results tended to favour the DIEP flap breast reconstructions in terms of aesthetic outcome. Superior aesthetic outcome in autologous reconstruction compared with implant-based reconstructions has been reported previously [Citation21–23]. The difference in aesthetic outcome between the reconstruction methods may increase with time as autologous reconstructions tend to be stable over time, unlike implants. Other treatments and patient characteristics may also influence the aesthetic outcome. Radiation therapy had a negative effect on the overall aesthetic outcome in a study by Huis in 't Veld et al. [Citation6]. In addition, higher BMI and reoperations due to complications have also been reported to negatively affect aesthetics [Citation21]. An in-depth review of our study material supports these results as some of the patients with low overall aesthetic outcome had been through reoperations due to complications, and in one case had a large increase in BMI. However, these associations must be confirmed in a larger body of material.

In concordance with previous studies, we acknowledge that the use of an assessment scale for aesthetic evaluation of breast reconstructions is time-consuming. To facilitate a high number of participating panellists, we used REDCap, which provided a more flexible way to evaluate the photos. Although there is a considerable advantage to using an electronic platform that can be accessed easily, a drawback is the possible influence of external factors. Strengths of this study are that the patients were randomised to breast reconstruction with an EP or a DIEP flap and that they were included consecutively. The panels were blinded to the reconstruction method and all reconstructions were assessed twice, once in 2D and once in 3D format. The long follow-up time provided evaluation of breast reconstructions that were somewhat stable in their appearances. A weakness of the study is that the reliability analysis for the patient panel was based on different photos to those used with the expert and layman panels. The low number of patients included in this study is another weakness. Moreover, it is important to consider the drawback of not being able to rotate the 3D photos. Had this feature been available to the panellists, the results might have differed.

Conclusion

The results of this study suggest that continued use of the assessment scale in breast reconstructions can be recommended. A possible value of assessing scar appearance with photos in 3D format was found. A comparison between clinical assessments in the outpatient clinic and assessment of 3D photos is yet to be performed.

Acknowledgement

The authors are very grateful to the medical photographer Magnus Nilsson and to the panellists for their contributions to this study.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • National Quality Register for Breast Cancer. Immediate reconstructions at mastectomy [internet]. Stockholm: National Quality Register for Breast Cancer; 2021. https://statistik.incanet.se/brostcancer/
  • Yoon AP, Qi J, Brown DL, et al. Outcomes of immediate versus delayed breast reconstruction: results of a multicenter prospective study. Breast. 2018;37:72–79.
  • Liu T, Freijs C, Klein HJ, et al. Patients with abdominal-based free flap breast reconstruction a decade after surgery: a comprehensive long-term follow-up study. J Plast Reconstr Aesthet Surg. 2018;71(9):1301–1309.
  • Visser NJ, Damen THC, Timman R, et al. Surgical results, aesthetic outcome, and patient satisfaction after microsurgical autologous breast reconstruction following failed implant reconstruction. Plast Reconstr Surg. 2010;126(1):26–36.
  • Lindegren A, Halle M, Docherty Skogh AC, et al. Postmastectomy breast reconstruction in the irradiated breast: a comparative study of DIEP and latissimus dorsi flap outcome. Plast Reconstr Surg. 2012;130(1):10–18.
  • Huis In 't Veld EA, Long C, Sue GR, et al. Analysis of aesthetic outcomes and patient satisfaction after delayed-immediate autologous breast reconstruction. Ann Plast Surg. 2018;80(5S Suppl 5):S303–S307.
  • Ramon Y, Ullmann Y, Moscona R, et al. Aesthetic results and patient satisfaction with immediate breast reconstruction using tissue expansion: a follow-up study. Plast Reconstr Surg. 1997;99(3):686–691.
  • Teotia SS, Alford JA, Kadakia Y, et al. Crowdsourced assessment of aesthetic outcomes after breast reconstruction. Plast Reconstr Surg. 2021;147(3):570–577.
  • Eltahir Y, Bosma E, Teixeira N, et al. Satisfaction with cosmetic outcomes of breast reconstruction: investigations into the correlation between the patients’ Breast-Q outcome and the judgment of panels. JPRAS Open. 2020;24:60–70.
  • Maass SW, Bagher S, Hofer SO, et al. Systematic review: aesthetic assessment of breast reconstruction outcomes by healthcare professionals. Ann Surg Oncol. 2015;22(13):4305–4316.
  • O'Connell RL, Di Micco R, Khabra K, et al. Comparison of immediate versus delayed DIEP flap reconstruction in women who require postmastectomy radiotherapy. Plast Reconstr Surg. 2018;142(3):594–605.
  • Dahlback C, Ringberg A, Manjer J. Aesthetic outcome following breast-conserving surgery assessed by three evaluation modalities in relation to health-related quality of life. Br J Surg. 2019;106(1):90–99.
  • Tallroth L, Velander P, Klasson S. A short-term comparison of expander prosthesis and DIEP flap in breast reconstructions: a prospective randomized study. J Plast Reconstr Aesthet Surg. 2021;74(6):1193–1202.
  • Harris PA, Taylor R, Minor BL, REDCap Consortium, et al. The REDCap consortium: building an international community of software platform partners. J Biomed Inform. 2019;95:103208.
  • Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)–a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–381.
  • Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–174.
  • Veiga DF, Neto MS, Garcia EB, et al. Evaluations of the aesthetic results and patient satisfaction with the late pedicled TRAM flap breast reconstruction. Ann Plast Surg. 2002;48(5):515–520.
  • Godden AR, Wood SH, James SE, et al. A scoring system for 3D surface images of breast reconstruction developed using the Delphi consensus process. Eur J Surg Oncol. 2020;46(9):1580–1587.
  • Gahm J, Edsander-Nord A, Jurell G, et al. No differences in aesthetic outcome or patient satisfaction between anatomically shaped and round expandable implants in bilateral breast reconstructions: a randomized study. Plast Reconstr Surg. 2010;126(5):1419–1427.
  • Mosmuller DGM, Maal TJ, Prahl C, et al. Comparison of two- and three-dimensional assessment methods of nasolabial appearance in cleft lip and palate patients: do the assessment methods measure the same outcome? J Craniomaxillofac Surg. 2017;45(8):1220–1226.
  • Duraes EFR, Schwarz GS, de Sousa JB, et al. Factors influencing the aesthetic outcome and quality of life after breast reconstruction: a cross-sectional study. Ann Plast Surg. 2020;84(5):494–506.
  • Clough KB, O'Donoghue JM, Fitoussi AD, et al. Prospective evaluation of late cosmetic results following breast reconstruction: I. Implant reconstruction. Plast Reconstr Surg. 2001;107(7):1702–1709.
  • Clough KB, O'Donoghue JM, Fitoussi AD, et al. Prospective evaluation of late cosmetic results following breast reconstruction: II. Tram flap reconstruction. Plast Reconstr Surg. 2001;107(7):1710–1716.