
Validation and head-to-head comparison of four models for predicting malignancy of intraductal papillary mucinous neoplasm of the pancreas: A study based on endoscopic ultrasound findings

2019-12-14

Jie Hua, Bo Zhang, Xiu-Jiang Yang, Yi-Yin Zhang, Miao-Yan Wei, Chen Liang, Qing-Cai Meng, Jiang Liu, Xian-Jun Yu, Jin Xu, Si Shi

Jie Hua, Bo Zhang, Yi-Yin Zhang, Miao-Yan Wei, Chen Liang, Qing-Cai Meng, Jiang Liu, Xian-Jun Yu, Jin Xu, Si Shi, Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Department of Oncology, Fudan University Shanghai Medical College, Shanghai Pancreatic Cancer Institute, Shanghai 200032, China

Xiu-Jiang Yang, Department of Endoscopy, Fudan University Shanghai Cancer Center, Shanghai 200032, China

Abstract

Key words: Intraductal papillary mucinous neoplasms; Prediction model; Endoscopic ultrasound; Mural nodules; Malignancy

INTRODUCTION

Intraductal papillary mucinous neoplasm (IPMN) of the pancreas, a type of precursor lesion of pancreatic cancer, may evolve from low-grade dysplasia to high-grade dysplasia to invasive carcinoma[1]. The management of pancreatic IPMN is evolving with a better understanding of its biology and natural history[2]. First formulated in 2006, the International Consensus Guidelines (ICG) for the management of IPMN have undergone two revisions[3-5]. The most recent version (ICG 2017) added mural nodule size as a malignancy predictor[5]. However, these guidelines still result in a low positive predictive value (PPV) and miss some malignant IPMNs[6,7]. In addition, they do not provide the individual risk of malignancy, so a patient’s risk-benefit profile for resection vs surveillance cannot be obtained.

Guidelines usually treat patients with the same or similar features uniformly, while nomograms generate the individual probability of a clinical event by integrating determinant variables and therefore facilitate the advancement of personalized medicine. In 2010, Shimizu et al[8] first developed a nomogram to predict the probability of carcinoma in patients with IPMN. The included factors were sex, type of lesion, size of mural nodules, and grade of pancreatic juice cytology. The reported area under the curve (AUC) for the model was 0.903. A subsequent study using data from three high-volume centers validated its accuracy (AUC: 0.760)[9]. However, this model included pancreatic juice cytology, which was difficult to standardize among different centers, thus limiting its generalizability. To simplify the model while maintaining the diagnostic accuracy, a modified model supported by the Japan Pancreas Society (the JPS model) was constructed on the basis of 466 patients from three centers and was further validated using 664 patients from eight centers[10]. In addition, Correa-Gallego and colleagues at Memorial Sloan-Kettering Cancer Center proposed a predictive nomogram with a concordance index (C-index) of 0.74[11]. They subsequently updated the nomogram by expanding the patient population to include a dataset from three institutions of the Pancreatic Surgery Consortium (the PSC model)[12]. The PSC model showed a high diagnostic accuracy with C-indexes of 0.82 and 0.81 for the training and validation sets, respectively. Moreover, Gemenetzis et al[13] from Johns Hopkins Hospital (JHH) found that the neutrophil-to-lymphocyte ratio (NLR) was significantly elevated in patients with IPMN-associated invasive carcinoma. They introduced a nomogram (the JHH model) that included the NLR value, and the reported C-index was as high as 0.895. However, this model lacks external validation, and the NLR may be affected by a wide range of systemic diseases, thus providing false indications. Furthermore, Jang and colleagues developed a nomogram for predicting the individual risk of malignancy in patients with branch duct (BD)-type IPMN on the basis of data from thirteen centers in Japan (JPN) and nine centers in Korea (KOR)[14]. They reported AUCs of 0.783 and 0.737 for the training and validation sets, respectively.

Currently, whether any one model is superior to the others and whether it can be widely applied in clinical practice remain unknown. To address this, we performed a head-to-head comparison of four models, namely, the JPS model, the PSC model, the JHH model, and the Japan-Korea (JPN-KOR) model, for predicting the individualized probability of malignancy in patients with IPMN.

MATERIALS AND METHODS

Study cohort

Between 2010 and 2018, a total of 195 patients who underwent resection for IPMN at Fudan University Shanghai Cancer Center were identified from a prospectively maintained surgical database. Among these patients, 14 with concomitant pancreatic ductal adenocarcinoma (PDAC) were excluded from analysis due to the lack of histological transition between IPMN and PDAC, leaving 181 patients for analysis. Patient demographic, clinical, laboratory, and pathological data were collected. The presence of symptoms was defined as any episode of abdominal pain or discomfort. Symptoms of weight loss and jaundice were recorded separately. Serum levels of amylase, carbohydrate antigen (CA) 19-9, and carcinoembryonic antigen (CEA) were recorded from values obtained at preoperative testing. Cyst size and location, size of mural nodule/solid component, and diameter of the main pancreatic duct were recorded from endoscopic ultrasound (EUS) images and report archives. Mural nodules were further subclassified into four types as proposed by Ohno et al[15]. The type of lesion was categorized as main duct (MD), BD, or mixed type according to EUS findings. All EUS examinations were performed by an experienced endoscopist (Xiu-Jiang Yang), who has over ten years of experience in the diagnosis of pancreaticobiliary diseases and performs over 1000 EUS examinations per year. The apparatuses used were EG-3270UK/EG-3870UTK electronic linear ultrasonographic endoscopes (PENTAX, Tokyo, Japan) and an HI VISION Preirus ultrasound machine (Hitachi, Tokyo, Japan). Pathological assessment was based on the highest grade of dysplasia in the resected lesion: low- or intermediate-grade dysplasia was classified as benign, while high-grade dysplasia and invasive carcinoma (tubular carcinoma and colloid carcinoma) were classified as malignant. This study was approved by the review board of our institute, and the requirement for written informed consent was waived due to the retrospective nature of the study.

Statistical analysis

The risk score or probability of malignancy was calculated for each of the four prediction models. The JPS model includes three variables (mural nodule size, MD diameter, and cyst size of the BD) in a risk score table regardless of lesion type, while the PSC model has two separate nomograms for MD-IPMN and BD-IPMN, each including five variables (Supplementary Table 1). The JHH model incorporates the NLR as a predictive factor in the nomogram. The JPN-KOR model was specifically designed for predicting malignancy of BD-IPMN, and the predicted risks of malignancy are calculated with the web-based nomogram (http://statgen.snu.ac.kr/software/nomogramIPMN). These four models were compared with regard to discrimination, calibration, and clinical usefulness. Discrimination was measured by the C-index, which is equivalent to the AUC for a binary outcome. Calibration was assessed by comparing the model-predicted probability of malignancy with the observed probability. Clinical usefulness was estimated using decision curve analysis (DCA) to ascertain the net benefit related to the use of the four examined models. In addition, we tabulated sensitivity, specificity, PPV, negative predictive value (NPV), and diagnostic accuracy for each model and the ICG 2017 criteria.
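As a minimal illustration of the discrimination measure described above, the following Python sketch computes the C-index for a binary outcome as the proportion of malignant/benign patient pairs in which the malignant case received the higher predicted score, with ties counted as one half. The function name `c_index` and the toy labels and scores are hypothetical and are not study data; the analyses in this study were performed in R, so this is a sketch of the concept rather than the code actually used.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equals the AUC).

    y_true  : list of 0/1 labels (1 = malignant, 0 = benign)
    y_score : list of predicted risks or risk scores, same length
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Both outcome classes must be present.")

    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0   # malignant case ranked higher
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
toy_c = c_index(toy_labels, toy_scores)
print(f"C-index = {toy_c:.3f}")
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of malignant above benign cases.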

Clinicopathological data in patients with benign versus malignant IPMN were compared using the χ² test for categorical variables and the Mann-Whitney U test for continuous variables. Multivariate logistic regression with backward stepwise selection was performed to assess the association between variables (selected on the basis of univariate significance) and malignant IPMN. Statistical analyses were performed with R software (version 3.5.0; R Foundation for Statistical Computing, Vienna, Austria). P < 0.05 was considered statistically significant. Statistical review was performed by Jia-Xin Zhou (Biomedical Informatics and Statistics Center, School of Public Health, Fudan University).

RESULTS

Of the 181 included patients, 94 were categorized as having benign disease, and the remaining 87 were categorized as having malignant disease based on the final pathological analysis (Figure 1). The patient characteristics are summarized in Table 1. The mean patient age at resection was 61 years, and the male-to-female ratio was 1.5:1. Fifteen of seventeen (88%) patients who presented with jaundice were shown to have malignant IPMN. Patients with MD-IPMN had a significantly higher likelihood of having malignant disease than patients with BD-IPMN (MD 82.8% vs BD 9.2%; P < 0.001). A mural nodule/solid component was clearly identified on EUS images in 54.1% of patients (Supplementary Figure2). Of the 98 patients with mural nodules/solid components, 13 (13.3%) had type I, 30 (30.6%) had type II, 34 (34.7%) had type III, and 21 (21.4%) had type IV lesions. Patients with mural nodules/solid components were more likely to have malignant disease than patients without (P < 0.001). The size of the mural nodule/solid component was significantly larger in patients with malignant disease than in patients with benign disease (P < 0.001).
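The jaundice figures reported above (15 of 17 patients with jaundice had malignant disease, among 87 malignant and 94 benign cases overall) imply a 2 × 2 table that can illustrate the χ² comparison named in the statistical methods. The sketch below rests on stated assumptions: it uses the Pearson χ² statistic without continuity correction, which may differ from the exact procedure used for Table 1, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts derived from the text: rows = jaundice yes/no, columns = malignant/benign.
jaundice_malignant = 15
jaundice_benign = 17 - 15
no_jaundice_malignant = 87 - jaundice_malignant
no_jaundice_benign = 94 - jaundice_benign

chi2, p_value = chi_square_2x2(
    jaundice_malignant, jaundice_benign,
    no_jaundice_malignant, no_jaundice_benign,
)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

Under these assumptions the association between jaundice and malignancy is significant at the P < 0.05 level used in this study.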

Univariate analysis identified eight variables that were significantly associated with malignant IPMN: presence of symptoms, presence of jaundice, MD- and mixed-type lesion, presence of mural nodule/solid component, size of mural nodule/solid component, main pancreatic duct (MPD) diameter, and elevated serum levels of CA19-9 and CEA. Multivariable logistic regression with these eight variables identified the presence of symptoms, MD- and mixed-type lesion, size of mural nodule/solid component, MPD diameter, and elevated serum CA19-9 level as independent risk factors for malignancy in IPMN patients (Figure 2).

In the discrimination analysis, the C-index of the PSC model was 0.842 [95% confidence interval (CI): 0.782-0.901], which was the highest among the four examined models (Figure 3). DeLong’s test revealed a significant difference in AUC between the PSC model and each of the other models: the JPS model (P < 0.001), the JHH model (P = 0.002), and the JPN-KOR model (P < 0.001). Calibration plots showed that the PSC model had the least pronounced departure from ideal predictions (Figure 4). Of the remaining three models, the JPS and JHH models underestimated the probability of malignancy, while the JPN-KOR model overestimated the malignant potential of BD-IPMN. In the DCA, the PSC model showed a better clinical net benefit than the JPS and JHH models (Supplementary Figure2). The JPN-KOR model also resulted in a net benefit compared to the use of no risk stratification tool.
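The net benefit plotted in a decision curve can be sketched as follows. At a threshold probability p_t, net benefit is the true-positive fraction minus the false-positive fraction weighted by p_t/(1 − p_t), and the "treat all" strategy (resect every patient) serves as a reference. The function names and the toy labels and probabilities are hypothetical; only the cohort composition (87 malignant, 94 benign) is taken from this study, and no model predictions from the study are reproduced here.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of resecting patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as high risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Hypothetical toy predictions, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.8]
toy_nb = net_benefit(toy_labels, toy_probs, 0.5)
print(f"toy model net benefit at 0.5: {toy_nb:.3f}")

# "Treat all" reference using the cohort composition reported in this study.
cohort_labels = [1] * 87 + [0] * 94
for pt in (0.2, 0.5):
    print(f"treat-all net benefit at {pt}: {net_benefit_treat_all(cohort_labels, pt):.3f}")
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none strategy).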

With regard to diagnostic tests, the ICG 2017 criteria resulted in a PPV of 0.603, NPV of 0.840, and diagnostic accuracy of 0.669 (Table 2). Comparison of the four models revealed that the PSC model had a relatively high PPV (0.768), NPV (0.837), and accuracy (0.801). The JPN-KOR model had the highest NPV (0.907) but overestimated the malignancy of BD-IPMN (11 of 60 nonmalignant cases were misclassified as high-grade dysplasia or invasive carcinoma) with a PPV of 0.214, resulting in a diagnostic accuracy of 0.765.
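The relationship between these diagnostic measures can be made concrete with a small worked example for the JPN-KOR model. The text gives 11 false positives among 60 nonmalignant BD-IPMN cases, hence 49 true negatives. The numbers of true positives (3) and false negatives (5) are not stated here; they are inferred from the reported PPV and NPV and should be treated as assumptions. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from a 2 x 2 confusion table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# fp and tn follow from the text (11 of 60 nonmalignant cases misclassified);
# tp and fn are inferred from the reported PPV and NPV (assumption).
jpn_kor = diagnostic_metrics(tp=3, fp=11, fn=5, tn=49)
for name in ("ppv", "npv", "accuracy"):
    print(f"{name}: {jpn_kor[name]:.3f}")
```

With these counts the sketch reproduces the reported PPV, NPV, and diagnostic accuracy after rounding to three decimals, which shows how a small number of true positives in a low-prevalence subgroup yields a low PPV alongside a high NPV.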

DISCUSSION

This study represents the first head-to-head comparison of four models for preoperatively predicting the individual risk of malignancy in patients with IPMN. The results showed that compared with the other three models, the PSC model had the highest C-index, the least pronounced departure from ideal predictions, the highest net benefit, and the highest diagnostic accuracy.

Table 1 Patient characteristics

External validation of a diagnostic model is important because it determines whether the diagnostic accuracy of the model can be reproduced in other cohorts. Our study provides external validation for the four examined models, and the results reveal fairly good performance in two models. The C-index and calibration plot of the PSC model were consistent with the results from the original training cohort. The C-index, PPV, NPV, and diagnostic accuracy of the JPS model were also similar to those from the primary cohort. Although the C-index of the JPN-KOR model decreased in our cohort, the diagnostic accuracy was comparable to that of the training set. However, due to the lack of internal validation and the nature of highly selected patients in the JHH model, the C-index decreased from 0.895 (original report) to 0.754 when the model was applied to our cohort.

Figure 1 Study cohort. HGD: High-grade dysplasia; IGD: Intermediate-grade dysplasia; IPMN: Intraductal papillary mucinous neoplasm; LGD: Low-grade dysplasia; PDAC: Pancreatic ductal adenocarcinoma.

The current recommendations for the management of IPMN are mainly based on the ICG. The most recent version, ICG 2017, includes minor revisions regarding the prediction of high-grade dysplasia and invasive carcinoma, surveillance, and postoperative follow-up of IPMN[5]. Mural nodule size has been added as a predictor of malignancy. However, guideline-based patient management remains unsatisfactory. It helps clinicians and surgeons not to miss malignant disease (high NPV) but frequently leads to unnecessary surgery (low PPV)[16]. Nomograms, as predictive tools, provide an individualized risk score for a given patient, which facilitates the development of a more precise treatment strategy[17]. Three of the four examined models showed an improved PPV compared with that of ICG 2017 (Table 2), and all four models showed an improved overall diagnostic accuracy, supporting the utility of these models in clinical practice.

EUS is superior to computed tomography and magnetic resonance imaging in IPMN diagnosis, especially in identifying malignant characteristics such as mural nodules[18]. EUS enables observation and measurement of mural nodules in real time, and Doppler or contrast-enhanced harmonic EUS can demonstrate the presence of blood supply in mural nodules. In our cohort, mural nodules were present in more than half of the patients, and contrast-enhanced EUS was employed to differentiate mural nodules from mucin globs in approximately 20% of the patients. However, over one-third of the patients with mural nodules/solid components had nonmalignant disease (low- and intermediate-grade dysplasia). Of these patients, 17 had mural nodules ≥ 5 mm. Therefore, mural nodules per se may not always indicate malignancy, and a comprehensive analysis including contrast-enhanced EUS, confocal laser endomicroscopy, and tissue/cystic fluid aspiration-based molecular signatures (e.g., KRAS/GNAS mutations) may facilitate a more accurate preoperative determination of the grade of dysplasia.

In the current cohort, 75% of resected MD-IPMN cases were found to be malignant, compared to 12% of BD-IPMN cases and 41% of mixed-type IPMN cases, similar to the rates in previous studies[16,19]. One may argue that the majority of BD-IPMN patients, even those with worrisome features, may be closely followed rather than immediately subjected to resection, since the 5-year disease-specific survival of these patients can reach 96.2%[20]. Indeed, BD-IPMN had a much lower rate of malignant transformation within 5 years (4.3%) than MD-IPMN (48.0%)[21,22]. Therefore, the JPS and JHH models, which are applicable to any IPMN type, may underestimate the overall risk of malignancy. This might be a reason why the accuracy of the JPS and JHH models was lower than that of the PSC model, which uses separate nomograms for MD-IPMN and BD-IPMN.

Elevated preoperative serum CA19-9 levels (> 37 U/mL) were an independent risk factor for high-grade dysplasia and invasive carcinoma of IPMN in our cohort. An elevated CA19-9 level has also been demonstrated to be predictive of malignancy in many previous reports[23-25]. In the four examined models, however, only the JPN-KOR model included serum CA19-9 level as a predictor. The cohort that was used to develop the PSC model indicated significance of elevated CA19-9 levels in univariate analysis[11]. However, this factor was excluded from the final model due to the lack of preoperative CA19-9 levels in approximately 40% of the patients. From this perspective, the PSC model still has room for improvement. We have also constructed a nomogram based on our own data (Supplementary Figure 3), with a C-index of 0.897 (95% CI: 0.890-0.904) and bias-corrected estimates of 0.882 for bootstrapping and 0.875 for 10-fold cross-validation, although it needs further external validation.

Figure 2 Associations between five risk factors and malignant intraductal papillary mucinous neoplasm in multivariate logistic regression analysis. BD: Branch duct; CA: Carbohydrate antigen; CI: Confidence interval; MD: Main duct; MPD: Main pancreatic duct.

One of the limitations of this study is that our cohort only included patients with surgically resected, pathologically confirmed IPMN, and whether the results of the comparison are applicable to unresected patients remains unknown. In addition, the current comparison was based on Asian patients, and the performance of each model may be better or worse in other populations or under different clinical circumstances.

In conclusion, the PSC model exhibits the best performance characteristics in predicting the malignancy of pancreatic IPMN. Consequently, the PSC model should be considered the best tool in clinical practice for assessing an individual’s risk for malignant IPMN.

Table 2 Comparison of the four models and the International Consensus Guidelines 2017

Figure 3 Comparison of the four models using receiver operating characteristic curves. AUC: Area under the curve; CI: Confidence interval; JHH: Johns Hopkins Hospital; JPN-KOR: Japan-Korea; JPS: Japan Pancreas Society; PSC: Pancreatic Surgery Consortium.

Figure 4 Calibration plots of the observed vs predicted probability of malignancy in patients with intraductal papillary mucinous neoplasm. A: The PSC model; B: The JPS model; C: The JHH model; D: The JPN-KOR model. JHH: Johns Hopkins Hospital; JPN-KOR: Japan-Korea; JPS: Japan Pancreas Society; PSC: Pancreatic Surgery Consortium.

ARTICLE HIGHLIGHTS

Research background

Intraductal papillary mucinous neoplasm (IPMN) has the potential to become malignant. Thus, preoperative prediction of its malignancy is of vital importance to clinical practice. Currently, several models are available for predicting the malignancy of IPMN. However, whether these models can be widely applied in clinical practice remains unknown. This study aimed to externally validate these models and compare their accuracy in predicting the individualized probability of malignancy in patients with IPMN. The results may aid clinicians in assessing an individual’s risk for malignant IPMN.

Research motivation

To better facilitate clinicians’ evaluation of a patient’s risk-benefit profile for IPMN resection vs surveillance.

Research objectives

The aim of this study was to perform a head-to-head comparison of four models for predicting the malignancy of IPMN. The results may provide a reference for clinicians when evaluating the malignant potential of IPMN.

Research methods

Data of 181 patients with available preoperative endoscopic ultrasound records and pathologically confirmed IPMN were collected from a prospectively maintained institutional database over a 9-year period. The models were compared using the concordance index (C-index), calibration plots, decision curve analyses, and diagnostic tests.

Research results

The C-index of the model from the Pancreatic Surgery Consortium (PSC) was 0.842 [95% confidence interval (CI): 0.782-0.901], which was the highest among the four examined models. Calibration plots showed that the PSC model had the least pronounced departure from ideal predictions. In the decision curve analyses, the PSC model showed a better clinical net benefit than the three other models. Diagnostic tests also showed that the PSC model had the highest accuracy (0.801).

Research conclusions

The PSC model showed the best performance characteristics. Therefore, the PSC model should be considered the best tool for assessing an individual’s risk for malignant IPMN and may facilitate clinical decision-making regarding resection vs surveillance.

Research perspectives

Future studies should focus on integrating CA19-9 into the PSC model and developing reliable biomarkers for predicting IPMN malignancy.

ACKNOWLEDGEMENTS

We thank Jia-Xin Zhou (Biomedical Informatics and Statistics Center, School of Public Health, Fudan University) for providing statistical support.