Personal predictive model based on systemic inflammation markers for estimation of postoperative pancreatic fistula following pancreaticoduodenectomy
2022-10-11
Zhi-Da Long, Chao Lu, Xi-Gang Xia, Bo Chen, Zhi-Xiang Xing, Lei Bie, Peng Zhou, Zhong-Lin Ma, Rui Wang
Zhi-Da Long, Chao Lu, Xi-Gang Xia, Bo Chen, Zhi-Xiang Xing, Lei Bie, Peng Zhou, Rui Wang, Department of Hepatobiliary and Pancreaticosplenic Surgery, Jingzhou Hospital, Yangtze University, Jingzhou 434020, Hubei Province, China
Zhong-Lin Ma, Department of Hepatobiliary Surgery, Lu’an Hospital of Anhui Medical University, Hefei 237006, Anhui Province, China
Abstract
BACKGROUND Postoperative pancreatic fistula (PF) is a serious, life-threatening complication after pancreaticoduodenectomy (PD).
AIM To develop a machine learning (ML)-aided model for PF risk stratification.
METHODS We retrospectively collected data from 618 patients who underwent PD at two tertiary medical centers between January 2012 and August 2021. We used ML algorithms to build predictive models and assessed the predictive efficiency of each model with decision curve analysis, the area under the receiver operating characteristic curve (AUC) and the clinical impact curve.
RESULTS A total of 29 variables were used to build the ML predictive models. The best predictive model was the random forest classifier (RFC), with an AUC of 0.897 [95% confidence interval (CI): 0.370-1.424], while the AUCs of the artificial neural network, eXtreme gradient boosting, support vector machine, and decision tree models ranged from 0.726 (95%CI: 0.191-1.261) to 0.882 (95%CI: 0.321-1.443).
CONCLUSION Fluctuating serological inflammatory markers and the prognostic nutritional index can be used to predict postoperative PF.
Key Words: Pancreatoduodenectomy; Pancreatic fistula; Machine learning algorithm; Systemic inflammatory biomarker; Risk prediction
INTRODUCTION
Pancreaticoduodenectomy (PD), also known as the Whipple procedure, is one of the most difficult and complex operations and carries a high rate of major complications[1]. Postoperative pancreatic fistula (PF), one of the most serious complications after PD, can endanger patients’ lives and has therefore been a continuing concern for pancreatic surgeons[1,2]. Although the safety of PD has improved significantly over the past three decades[3,4], prospective studies have reported an incidence of postoperative PF of > 10%[5-7].
In recent years, different surgical techniques and perioperative measures have been studied in an attempt to reduce the incidence of postoperative PF. However, regardless of the type of surgery, PF remains the most common potentially fatal complication after pancreatectomy. Understanding the potential complications, and detecting them early, is important for the care of these severely ill patients.
Previous studies have combined preoperative radiological and clinical variables with specific intraoperative factors to predict the risk of postoperative PF[8-11]. Although predictive tools for postoperative PF have advanced, the approaches keep changing and their predictive performance remains unsatisfactory, so an improved model is needed. An optimal risk score model may therefore help to prevent potentially life-threatening complications and to stratify patients by their risk of postoperative PF, which would benefit clinical management.
Serum markers of systemic inflammation have been associated with the risk of progression of benign and malignant diseases[12-14]. At the same time, the systemic reaction stimulated by local inflammation is closely related to complications after gastrointestinal surgery[15,16]. In addition, machine learning (ML) algorithms are now widely used in medicine. These continually evolving algorithms and iterative analyses may be useful for prognostication and may help to optimize individual treatment decisions[17]. Collectively, combining inflammatory markers with ML algorithms may improve predictive performance while minimizing prediction error.
Given this situation, we used inflammatory markers and ML-based algorithms to optimize the predictive accuracy for postoperative PF. In this study, we sought to identify alternative predictors independently related to postoperative PF and to develop an optimal risk stratification model that can accurately identify patients at high risk of postoperative PF.
MATERIALS AND METHODS
Patient selection
Patients who underwent PD for periampullary tumors at two tertiary medical centers (Jingzhou Hospital and Lu’an Hospital of Anhui Medical University) between January 2012 and August 2021 were retrospectively reviewed. The inclusion criteria were: (1) Resected tumor specimens were confirmed to be malignant by pathological examination; (2) Routine blood and liver function test results were available within 3 d before surgery; and (3) The patient had complete case data and relevant imaging, pathology and laboratory findings. The exclusion criteria were: (1) Preoperative treatment, such as thermal ablation, neoadjuvant chemotherapy or radiotherapy; (2) Severe respiratory or circulatory disease; (3) Severe acute cholangitis or infection elsewhere in the body before surgery; (4) Metastasis from other parts of the primary tumor or direct invasion of adjacent organs from the primary tumor; and (5) Parathyroid disease or other factors that cause abnormal changes in procalcitonin (PCT). This retrospective cohort study was approved by the Ethics Committee of Jingzhou Central Hospital (Reference: 2021-JH005) and conformed to the Declaration of Helsinki. Because follow-up was anonymous, patients’ personal information was kept strictly confidential. The detailed research flow chart is shown in Figure 1.
Figure 1 The flow chart. PD: Pancreatoduodenectomy.
Diagnostic criteria for postoperative PF
Postoperative PF was diagnosed according to the 2016 definition of the International Study Group for Pancreatic Fistula (ISGPF): when drainage flow is > 30 mL for ≥ 72 h after the operation, the amylase content of the drainage fluid is measured. If it is ≥ 3 times the upper limit of normal, has a clinical impact (such as abdominal pain or fever) and requires clinical treatment, PF is judged to have occurred. The 2016 ISGPF update removed the diagnosis of grade A PF: an increase in drainage fluid amylase without symptoms is considered a biochemical leak, i.e., not a true PF. Biochemical leakage accompanied by significant clinical symptoms and a change in treatment strategy (such as puncture and drainage, interventional hemostasis, an indwelling abdominal drainage tube for > 3 wk, infection, etc.) is defined as grade B PF. If grade B PF requires surgical treatment, or is complicated by organ failure or death, it is upgraded to grade C. Grades B and C PF are therefore also known as clinical postoperative PF[18,19].
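The grading rules described above can be written as a small decision function. The following Python sketch is illustrative only: the function names `classify_pf` and `is_clinical_pf` and their inputs are assumptions introduced here, not part of the ISGPF definition or of the authors' analysis.

```python
def classify_pf(amylase_ratio, clinically_relevant, needs_reoperation=False,
                organ_failure=False, death=False):
    """Return 'none', 'biochemical leak', 'grade B' or 'grade C'.

    amylase_ratio: drain-fluid amylase divided by the upper limit of normal.
    clinically_relevant: True if symptoms led to a change in treatment.
    (Both inputs are hypothetical names chosen for this sketch.)
    """
    if amylase_ratio < 3:
        return "none"
    if not clinically_relevant:
        # Former grade A: raised amylase without symptoms, not a true fistula.
        return "biochemical leak"
    if needs_reoperation or organ_failure or death:
        return "grade C"
    return "grade B"


def is_clinical_pf(grade):
    # Grades B and C together form clinical postoperative PF.
    return grade in ("grade B", "grade C")
```

Under this sketch, only patients classified as grade B or grade C would count as events in the predictive models.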
Blood sample collection
Fasting blood samples of 3-5 mL were collected from each patient in the morning within 3 d before the operation, and the most recent routine blood and liver function test results were included in this study. Peripheral venous blood was taken in the morning on postoperative days 1, 3 and 5, and the changes in C-reactive protein (CRP), serum PCT, and white blood cells were monitored.
Data collection and quality assessment
We obtained baseline demographic and clinicopathological data from the patients’ medical records. For instance, pancreatic texture was evaluated by the surgeon during the operation (soft = 1, hard = 0), and the diameter of the main pancreatic duct was measured on preoperative computed tomography or magnetic resonance imaging. We also collected routine laboratory results; when ≥ 10% of the values of a variable were missing, the variable was discarded and not included in variable screening for the final model[20]. Finally, a total of 29 variables that met the inclusion criteria were used to build ML-based models.
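The missing-data rule can be illustrated with a short sketch. It assumes that each patient is stored as a dictionary and that a missing value is recorded as `None`; the function name `drop_sparse_variables` and the example variable names are hypothetical, and the authors' own R code is not given in the article.

```python
def drop_sparse_variables(records, threshold=0.10):
    """Return the variables whose missing fraction is below the threshold.

    records: list of dicts, one per patient; a missing value is None
    (an assumption of this sketch, not a detail stated in the article).
    """
    if not records:
        return []
    n = len(records)
    variables = sorted({key for row in records for key in row})
    kept = []
    for var in variables:
        missing = sum(1 for row in records if row.get(var) is None)
        if missing / n < threshold:  # variables with >= 10% missing are dropped
            kept.append(var)
    return kept


# Toy data: "crp" is missing in 2 of 20 patients (10%), "wbc" in 1 of 20 (5%).
records = [
    {"bmi": 22.0,
     "crp": None if i < 2 else 5.0,
     "wbc": None if i == 0 else 7.0}
    for i in range(20)
]
kept_variables = drop_sparse_variables(records)
```

In this toy example `crp` reaches the 10% limit and is discarded, while `bmi` and `wbc` are retained.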
Construction and verification of ML-based models
Before building the models, we randomly divided the patients into two parts, the training cohort and the validation cohort. The training cohort was used to construct the predictive models, and the validation cohort was used for internal validation to evaluate their robustness. When screening candidate variables, we adopted the “two-step segmentation evaluation”, that is, the principle of random sorting to obtain the intersection[21]. In short, the optimal subset for modeling was obtained by taking the intersection of the ranked variable sets. Finally, the models were evaluated by inspection, discrimination and calibration.
Statistical analysis
Descriptive variables (i.e., continuous or categorical variables) are presented as median (interquartile range) or frequency (percentage). The χ2 test or Mann-Whitney test was used to compare variables between groups. Stepwise regression based on the minimum Akaike information criterion was used to select variables. All data analyses were performed with R software (version 4.0.4, http://www.r-project.org/). All P values were two-tailed, and P < 0.05 was considered statistically significant.
RESULTS
Clinicopathological baseline characteristics of patients
In this study, all patients were randomly divided into a training set (n = 432, 70%) and a validation set (n = 186, 30%) via the caret package. Seventy-eight (18.06%) and 20 (10.75%) patients developed postoperative PF in the training and validation sets, respectively, as shown in Table 1. Overall, 76 (12.3%) patients had grade B PF and 22 (3.6%) had grade C PF. One patient died of multiple organ failure due to drug-resistant bacterial infection; five underwent reoperation because of continuous blood drainage via the drainage tube, which was confirmed to be abdominal bleeding caused by intraoperative PF; and two were transferred to intensive care.
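A 70/30 random split of 618 patients can be sketched as follows. This is a plain, unstratified shuffle written in Python for illustration; it is not the caret routine the authors used, and the function name `split_cohort` and the seed value are assumptions.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2022):
    """Randomly split patient identifiers into training and validation lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed (hypothetical) seed for reproducibility
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # truncates 432.6 to 432
    return ids[:n_train], ids[n_train:]


train, valid = split_cohort(range(618))
print(len(train), len(valid))  # 432 186
```

Truncating 618 × 0.7 gives the 432 training and 186 validation patients reported above. A stratified split, which keeps the PF rate similar in both sets, would be an alternative design.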
Table 1 Baseline demographic and clinicopathological characteristics of patients
POPF: Postoperative pancreatic fistula; IQR: Inter-quartile range; BMI: Body mass index; ASA: American Society of Anesthesiologists; CRP: C-reactive protein; WBC: White blood cell; PCT: Procalcitonin; AGR: Albumin-to-globulin ratio; PNI: Prognostic nutrition index; NLR: Neutrophil-to-lymphocyte ratio; NAR: Neutrophil-to-albumin ratio; PLR: Platelet-to-lymphocyte ratio; LMR: Lymphocyte-to-monocyte ratio; HALP: Hemoglobin level × albumin level × lymphocyte count/platelet count ratio.
Selection of candidate variables
Feature selection is a universal problem in ML[22]. We performed an iterative analysis of the 29 candidate variables, and the correlation matrix showed significant correlations between postoperative PF and inflammatory markers and some clinical variables (Figure 2A), including CRP, PCT, neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and hemoglobin level × albumin level × lymphocyte count/platelet count ratio (HALP). As shown in Figure 2B, HALP, PCT, neutrophil-to-albumin ratio (NAR), PLR and prognostic nutritional index (PNI) were the most important predictors. The seven top-ranked predictors were HALP, remnant texture, PCT, NAR, PLR, PNI, and body mass index (BMI).
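The composite markers named above are simple ratios of routine blood test values, which the following sketch computes. The units (cell counts in 10^9/L; hemoglobin, albumin and globulin in g/L) and the function name are assumptions. The article does not state its PNI formula, so the commonly used definition (albumin + 5 × lymphocyte count) is assumed here.

```python
def inflammation_indices(neutrophils, lymphocytes, monocytes, platelets,
                         hemoglobin, albumin, globulin):
    """Compute composite inflammatory and nutritional markers.

    Assumed units: cell counts in 10^9/L; hemoglobin, albumin, globulin in g/L.
    """
    return {
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
        "LMR": lymphocytes / monocytes,
        "NAR": neutrophils / albumin,
        "AGR": albumin / globulin,
        "HALP": hemoglobin * albumin * lymphocytes / platelets,
        # Assumption: commonly used PNI definition, not stated in the article.
        "PNI": albumin + 5 * lymphocytes,
    }


# Hypothetical patient values, for illustration only.
example = inflammation_indices(neutrophils=4.0, lymphocytes=2.0, monocytes=0.5,
                               platelets=200.0, hemoglobin=120.0,
                               albumin=40.0, globulin=25.0)
```

For these made-up values the sketch gives NLR 2.0, PLR 100.0 and HALP 48.0.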
Figure 2 Variable filtering and weight allocation. A: Correlation matrix analysis; B: Weight distribution of the candidate variables. BMI: Body mass index; ASA: American Society of Anesthesiologists; CRP: C-reactive protein; WBC: White blood cell; PCT: Procalcitonin; AGR: Albumin-to-globulin ratio; PNI: Prognostic nutrition index; NLR: Neutrophil-to-lymphocyte ratio; NAR: Neutrophil-to-albumin ratio; PLR: Platelet-to-lymphocyte ratio; LMR: Lymphocyte-to-monocyte ratio; HALP: Hemoglobin level × albumin level × lymphocyte count/platelet count ratio; RFC: Random forest classifier; SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network; XGboost: Extreme gradient boosting.
Construction of PF predictive model based on ML algorithm
In the training cohort, each patient was labeled as positive or negative for postoperative PF, and each model output a final classification. For example, the random forest classifier (RFC) algorithm could effectively navigate the free parameter space to obtain a robust model (Figure 3A). The Gini index of each variable in the RFC model is shown in Supplementary Table 1. Data mining with the decision tree (DT) model was also useful: as shown in Figure 3B, among the candidate variables, PCT and BMI carried important branch weights in the DT and could serve as important predictors of postoperative PF. The artificial neural network (ANN) model also showed relatively robust predictive performance, although slightly lower than that of RFC (Figure 4). We also constructed nomograms based on the parameters obtained by logistic regression (LR), as shown in Supplementary Table 2. Compared with traditional predictive models, inflammatory factors also accounted for an important proportion.
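Tree-based models such as RFC and DT choose splits that reduce the Gini impurity of the outcome, and a variable's Gini index summarizes how much impurity its splits remove. The sketch below shows the underlying calculation for a binary outcome. It is a generic illustration with hypothetical function names, not the exact computation of the authors' software.

```python
def gini_impurity(labels):
    """Gini impurity of a list of binary labels (1 = PF, 0 = no PF)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p ** 2 - (1.0 - p) ** 2


def gini_decrease(parent, left, right):
    """Impurity removed by splitting the parent node into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted
```

A node with equal numbers of PF and non-PF patients has impurity 0.5, and a split that separates the two groups perfectly removes all of it.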
Figure 3 Visualization of predictive model based on machine learning algorithm. A: Random forest classifier model; B: Decision tree (DT) model. The candidate factors associated with postoperative pancreatic fistula were ranked via the RFC algorithm (A), and prediction nodes and weights were allocated via the DT algorithm (B). BMI: Body mass index; ASA: American Society of Anesthesiologists; CRP: C-reactive protein; WBC: White blood cell; PCT: Procalcitonin; AGR: Albumin-to-globulin ratio; PNI: Prognostic nutrition index; NLR: Neutrophil-to-lymphocyte ratio; NAR: Neutrophil-to-albumin ratio; PLR: Platelet-to-lymphocyte ratio; LMR: Lymphocyte-to-monocyte ratio; HALP: Hemoglobin level × albumin level × lymphocyte count/platelet count ratio; RFC: Random forest classifier; SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network.
Figure 4 Visualization of predictive model based on artificial neural network algorithm. A: Artificial neural network model; B: Variable importance using connection weight. BMI: Body mass index; PCT: Procalcitonin; PNI: Prognostic nutrition index; NAR: Neutrophil-to-albumin ratio; PLR: Platelet-to-lymphocyte ratio; HALP: Hemoglobin level × albumin level × lymphocyte count/platelet count ratio.
Comparison between ML-based models
To explore the effectiveness of the five supervised learning models for postoperative PF evaluation, we used decision curve analysis (DCA), and the results were consistent with those of candidate variable selection. Even when different predictive models included the same variables, their predictive effectiveness differed, as shown in Figure 5. In addition, as shown in Table 2, RFC had the best predictive efficiency [AUC 0.897, 95% confidence interval (CI): 0.370-1.424] compared with the other four predictive models, followed by ANN (0.882, 95%CI: 0.321-1.443), DT (0.807, 95%CI: 0.250-1.364), extreme gradient boosting (XGboost) (0.793, 95%CI: 0.270-1.316), and support vector machine (SVM) (0.726, 95%CI: 0.191-1.261). In summary, the supervised learning models, particularly RFC, ANN and DT (ML-aided decision support), were suitable for guiding postoperative PF prediction.
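The AUC is the probability that a randomly chosen patient with PF receives a higher predicted risk than a randomly chosen patient without PF. The sketch below computes it by pairwise comparison and adds a percentile bootstrap confidence interval. The article does not state how its intervals were obtained, so the bootstrap here is an assumed alternative, as are the function names and default settings. A percentile interval built this way always lies between 0 and 1.

```python
import random


def auc(scores, labels):
    """Probability that a random positive case outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    n = len(scores)
    if not 0 < sum(labels) < n:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            estimates.append(auc([scores[i] for i in idx], ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted risks and outcomes, for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)  # 8 of 9 pairs are correctly ordered
toy_ci = bootstrap_auc_ci(toy_scores, toy_labels, n_boot=200)
```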
Figure 5 Efficiency evaluation of machine learning-based prediction model. A: Decision curve analysis (DCA) of training set; B: DCA of testing set. SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network; RFC: Random forest classifier; XGboost: Extreme gradient boosting.
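Decision curve analysis compares models by their net benefit across risk thresholds. The sketch below uses the standard net benefit formula, true positives per patient minus false positives per patient weighted by the odds of the threshold. The function names and the toy data are assumptions, and the code does not reproduce the curves in Figure 5.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    pairs = list(zip(probabilities, labels))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(labels, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and outcomes, for illustration only.
toy_probs = [0.9, 0.6, 0.4, 0.1]
toy_outcomes = [1, 0, 1, 0]
nb_model = net_benefit(toy_probs, toy_outcomes, threshold=0.2)
nb_all = treat_all_net_benefit(toy_outcomes, threshold=0.2)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the net benefit of treating no one.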
Table 2 The operating characteristic curve analyses for each machine learning-based model
Internal validation of the optimal postoperative PF predictive model
We evaluated the clinical predictive efficiency of the optimal prediction model (RFC), as shown in Supplementary Figure 1. According to the clinical impact curve (CIC), RFC can accurately stratify patients by risk of postoperative PF. Overall, RFC performed best among the prediction models that incorporated inflammatory markers.
DISCUSSION
Our study revealed two major findings. First, accurate risk stratification of postoperative PF in patients who received PD was achievable and depended mainly on the added value of systemic inflammation markers. Second, the ML-based predictive models outperformed the traditional predictive model and are suitable for identifying patients who will develop postoperative PF.
Several risk factors for such complications have been reported in the literature, including pancreatic texture, BMI, intraoperative blood loss, blood transfusion, and operating time[9,23,24]. We summarize recent literature on predicting postoperative PF with various candidate predictive markers in Supplementary Table 3. Guo et al[25] reported that pancreatic texture, size of the main pancreatic duct, portal vein invasion and confirmed pathology are risk factors for postoperative PF. Tajima et al[26] concluded that preoperative imaging evaluation of pancreatic pathology would also be beneficial for risk stratification. Not surprisingly, systemic inflammatory markers such as neutrophils, lymphocytes, platelets, CRP, albumin, and other biomarkers may help predict postoperative PF. The systemic response to postoperative local inflammatory stimulation is closely related to complications after gastrointestinal surgery[27]. Gasteiger et al[15] reported that postoperative pancreatitis and inflammatory reaction are the main determinants of postoperative PF. In our analysis, inflammatory markers likewise carried substantial weight among the risk factors in the predictive model.
In this study, an attempt was made to improve early postoperative risk stratification by combining local pancreatic residual inflammatory status and systemic response. We found that abnormal HALP, PCT, NAR, PLR and PNI showed reliable predictive value for postoperative PF. Previous studies have confirmed that neutrophils, as the source of vascular endothelial growth factor and tissue inhibitor protease, can promote tumor infiltration and distant metastasis[28-30]. Additionally, the number of lymphocytes in cancer patients changes frequently, which seriously affects the prognosis and survival rate[31,32]. As noted above, it appears that inflammatory factors were highly related to the presence of postoperative PF. Combined with these findings, our analysis showed that systemic inflammatory markers are of value in predicting postoperative PF.
Our ML-based model was built on clinical parameters and laboratory test results, which is consistent with previous research. Clinical indicators including preoperative serum albumin, lipase level, and amount of intraoperative fluid infusion have been reported as independent risk factors for postoperative PF[23,24,33]. Therefore, we further analyzed the accuracy of predictive models that combined clinical parameters and systemic inflammatory markers using ML-based algorithms. As expected, systemic inflammatory markers carried a high weight in each model. Among these predictive models, RFC calculated the risk level from the candidate variables with the best predictive efficiency. This is not surprising, because RFC uses bootstrap resampling within a “bagging” procedure[34]. To assess the performance of the ML-based model, DCA and CIC were used, and the results were consistent with the expected goal. Taken together, our model may be applicable to patients scheduled for PD, in particular to help surgeons decide whether to take preventive measures against postoperative PF.
Despite several strengths, this study has some noteworthy limitations. First, the patients came from two tertiary referral hospitals, which may have resulted in selection bias. Second, although we established a well-performing predictive model with an ML-based algorithm, the model still needs to be confirmed in other hospital settings. Although we performed internal cross-validation, external data are needed to verify its feasibility in the future. Third, we used only simple classified data, and missing clinical data were not considered in the study. Hence, incorporating specific new technologies such as immunodiagnostic biomarkers may help to improve the accuracy of predictive models.
CONCLUSION
Our results provide new insights into candidate predictive markers associated with a high risk of PF. With the help of HALP, NAR, CRP, PCT and PLR, we developed ML-based predictive models, and the performance of these supervised models was superior to that of traditional predictive models. We expect these findings to extend research to strengthen clinical decision-making and guide treatment.
ARTICLE HIGHLIGHTS
Research background
We provide insights into the candidate predictive markers associated with a high risk of postoperative pancreatic fistula (PF) based on serum inflammatory markers. With the help of hemoglobin level × albumin level × lymphocyte count/platelet count ratio, neutrophil-to-albumin ratio, C-reactive protein, procalcitonin and platelet-to-lymphocyte ratio, we developed machine learning (ML)-based predictive models, and the predictive performance of these supervised models was superior to that of traditional predictive models. We expect these findings to extend research to strengthen clinical decision-making and guide treatment.
Research motivation
Fluctuating serological inflammation markers and the prognostic nutritional index can be detected in the early postoperative period and can be used clinically to predict postoperative PF; in particular, the random forest classifier (RFC) performed best, which can guide optimal treatment and clinical management and help prevent or mitigate adverse consequences.
Research objectives
A total of 29 variables were used to build the ML predictive models. The best predictive model was RFC, with an area under the curve (AUC) of 0.897 [95% confidence interval (CI): 0.370-1.424], while the AUCs of the artificial neural network, eXtreme gradient boosting, support vector machine, and decision tree models ranged from 0.726 (95%CI: 0.191-1.261) to 0.882 (95%CI: 0.321-1.443).
Research methods
Descriptive variables (i.e., continuous or categorical variables) are presented as median (interquartile range) or frequency (percentage). The χ2 test or Mann-Whitney test was used to compare variables between groups. Stepwise regression based on the minimum Akaike information criterion was used to select variables. All data analyses were performed with R software (version 4.0.4, http://www.r-project.org/). All P values were two-tailed, and P < 0.05 was considered statistically significant.
Research results
A total of 29 variables were used to build the ML predictive models. The best predictive model was RFC, with an area under the curve (AUC) of 0.897 [95% confidence interval (CI): 0.370-1.424], while the AUCs of the artificial neural network, eXtreme gradient boosting, support vector machine, and decision tree models ranged from 0.726 (95%CI: 0.191-1.261) to 0.882 (95%CI: 0.321-1.443).
Research conclusions
Fluctuating serological inflammatory markers and prognostic nutritional index (PNI) can be detected in the early postoperative period, which has been clinically proved to predict postoperative PF. In particular, RFC performed best, which can guide optimal treatment, clinical management, and prevent or mitigate adverse consequences.
Research perspectives
PD, also known as a Whipple procedure, is one of the most difficult and complex surgeries that carries a high rate of major complications. Postoperative PF, as one of the most difficult complications after PD, can seriously endanger the lives of patients, so it has become an area of continuous concern for pancreatic surgeons. Although the safety of PD has improved significantly in the past three decades, previous prospective studies have reported that postoperative PF has an incidence of > 10%. Understanding the potential complications and early warning of these complications is important for the care of these patients.
ACKNOWLEDGEMENTS
The authors thank all medical workers and patients involved in this study, including those involved in data collection and compilation.
FOOTNOTES
Author contributions: Long ZD, Lu C, Xia XG, Chen B, Xing ZX, Bie L, Zhou P, Ma ZL, and Wang R designed the research study; Long ZD, Lu C, Xia XG, and Chen B performed the research; Xia XG, Chen B, and Xing ZX contributed new reagents and analytic tools; Long ZD, Lu C, Xia XG, Chen B, Xing ZX, Bie L, Zhou P, Ma ZL, and Wang R analyzed the data and wrote the manuscript; all authors have read and approved the final manuscript.
Institutional review board statement: This retrospective study followed the Declaration of Helsinki and was ethically reviewed and approved by the Institutional Ethics Committee of Jingzhou Hospital, No. 2021-JH005.
Informed consent statement: Since the patient information contained in this study was anonymous, written informed consent was not obtained from all participants.
Conflict-of-interest statement: All authors declare that there is no conflict of interest.
Data sharing statement: No additional data are available.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is noncommercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Country/Territory of origin: China
ORCID number: Zhi-Da Long 0000-0002-6956-2567; Chao Lu 0000-0002-6236-1550; Xi-Gang Xia 0000-0002-4573-2387; Bo Chen 0000-0002-2017-1235; Zhi-Xiang Xing 0000-0002-1789-0078; Lei Bie 0000-0002-1078-2022; Peng Zhou 0000-0002-1456-1378; Zhong-Lin Ma 0000-0002-1287-0987; Rui Wang 0000-0002-6992-2287.
S-Editor: Chen YL
L-Editor: Kerr C
P-Editor: Chen YL