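Evaluation of machine learning algorithms for predicting early mortality in severe ischemic stroke
LUO Xiao, CHENG Yi, HE Qian, TU Bo-xiang, WU Cheng*, HE Jia*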
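(Department of Military Health Statistics, Faculty of Health Services, Naval Medical University (Second Military Medical University), Shanghai 200433, China
*Corresponding authors)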
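Abstract:
Objective To evaluate the performance of 3 machine learning algorithms (support vector machine [SVM], random forest, and extreme gradient boosting [XGBoost]) against a logistic regression model in predicting 30-d mortality in severe ischemic stroke. Methods Data were drawn from 2 358 patients with severe ischemic stroke who met the inclusion criteria in the Medical Information Mart for Intensive Care Ⅳ (MIMIC-Ⅳ) database, covering 2008 to 2019. Early mortality prediction models were built with SVM, random forest, XGBoost, and logistic regression, each combined with the synthetic minority oversampling technique (SMOTE). Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, F1 score, Brier score, and other metrics. Results On the original imbalanced dataset, the AUC values of the SVM, random forest, XGBoost, and logistic regression models for predicting early mortality were 0.78, 0.81, 0.84, and 0.83, respectively. On the SMOTE-synthesized dataset, the corresponding AUC values were 0.72, 0.84, 0.83, and 0.83. With the exception of SVM, the machine learning models (random forest and XGBoost) had predictive ability similar to that of logistic regression, but their accuracy and Brier scores were better, giving better overall classification performance. Conclusion Machine learning algorithms outperform the traditional logistic regression method in predicting early mortality in ischemic stroke.
Key words:  severe ischemic stroke  early mortality prediction  machine learning  synthetic minority oversampling technique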
DOI:10.16781/j.CN31-2187/R.20220608
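Received: 2022-07-20  Revised: 2022-09-02
Funding: Military "Double-Key" Discipline Construction Project-03; Academic Leaders Program of the Shanghai Three-Year Action Plan for Public Health System Construction (GWV-10.2-XD05); Discipline Construction Project of the Shanghai Three-Year Action Plan for Public Health System Construction (GWV-10.1-XK05).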
Prediction of early mortality of severe ischemic stroke patients based on machine learning algorithms
LUO Xiao, CHENG Yi, HE Qian, TU Bo-xiang, WU Cheng*, HE Jia*
(Department of Military Health Statistics, Faculty of Health Services, Naval Medical University (Second Military Medical University), Shanghai 200433, China
*Corresponding authors)
Abstract:
Objective To evaluate the performance of 3 machine learning algorithms (support vector machine [SVM], random forest, and extreme gradient boosting [XGBoost]) and logistic regression in predicting 30-d mortality in patients with severe ischemic stroke. Methods Data were used from 2 358 patients with severe ischemic stroke who met the inclusion criteria in the Medical Information Mart for Intensive Care Ⅳ (MIMIC-Ⅳ) database from 2008 to 2019. SVM, random forest, XGBoost, and logistic regression, each combined with the synthetic minority oversampling technique (SMOTE), were used to build early mortality prediction models. Model performance was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, F1 score, and Brier score. Results On the original imbalanced data, the AUC values of the SVM, random forest, XGBoost, and logistic regression models were 0.78, 0.81, 0.84, and 0.83, respectively. On the SMOTE-based synthetic data, the AUC values of the SVM, random forest, XGBoost, and logistic regression models were 0.72, 0.84, 0.83, and 0.83, respectively. With the exception of SVM, the machine learning models (random forest and XGBoost) had predictive ability similar to that of logistic regression, but their accuracy and Brier scores were better, giving better overall classification performance. Conclusion Machine learning algorithms perform better than traditional logistic regression in predicting early mortality in patients with ischemic stroke.
Key words:  severe ischemic stroke  early mortality prediction  machine learning  synthetic minority oversampling technique
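
The abstract names SMOTE and four evaluation metrics without defining them. The block below is a minimal, pure-Python sketch of what each one computes. It is not the authors' code, and a study like this would presumably rely on library implementations. The function names (`smote`, `auc`, `brier_score`, `accuracy_and_f1`), the neighbour count `k=5`, and the 0.5 classification threshold are assumptions made for illustration. The toy labels and probabilities are invented and are not study data.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Sketch of SMOTE: each synthetic sample lies on the segment between a
    minority point and one of its k nearest minority neighbours.
    Assumes `minority` holds at least two feature vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error of the predicted probabilities; lower is better."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def accuracy_and_f1(y_true, y_prob, threshold=0.5):
    """Accuracy and F1 score after thresholding the probabilities.
    The 0.5 threshold is an assumption, not taken from the paper."""
    y_pred = [int(p >= threshold) for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    acc = sum(1 for y, q in zip(y_true, y_pred) if y == q) / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return acc, f1


if __name__ == "__main__":
    # Invented toy data: 1 = died within 30 d, 0 = survived.
    y_true = [0, 0, 0, 0, 1, 1]
    y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]
    print("AUC:", auc(y_true, y_prob))
    print("Brier score:", brier_score(y_true, y_prob))
    print("accuracy, F1:", accuracy_and_f1(y_true, y_prob))
    print("synthetic:", smote([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]], n_new=2, k=2))
```

AUC depends only on how the cases are ranked, whereas accuracy and F1 depend on the chosen threshold and the Brier score depends on the probability values themselves. This is one way models with near-identical AUC values, as reported above, can still differ in accuracy and Brier score.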