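Construction of a prognostic model for invasive breast cancer using machine learning algorithms: based on the SEER database
LU Chunwei1, MA Jun*2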
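(1. Department of Integrative Medicine, Zhongshan Hospital, Fudan University, Shanghai 200032, China;
2. Department of Integrative Chinese and Western Medicine, Xiamen Branch of Zhongshan Hospital, Fudan University, Xiamen 361000, China
*Corresponding author)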
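Abstract:
Objective To use machine learning algorithms to analyze the factors that influence the prognosis of invasive breast cancer and to construct a prognostic model. Methods Clinical and pathological data of 24 584 patients with invasive breast cancer from 2010 to 2015 were collected from the Surveillance, Epidemiology, and End Results (SEER) database of the United States. Univariate analysis and logistic regression analysis were used to screen prognostic variables. Five machine learning classification algorithms (logistic regression, decision tree, support vector machine, random forest, and artificial neural network) were used to build prediction models of survival prognosis, and the predictive ability of each modeling method was evaluated using sensitivity, specificity, accuracy, and the area under the receiver operating characteristic (ROC) curve (AUC). Results Among the 21 model input variables, histological grade, T stage, N stage, M stage, brain metastasis, human epidermal growth factor receptor 2 expression status, and surgical treatment had a relatively large influence on the survival prognosis of patients with invasive breast cancer. Of the prognostic models built with the 5 machine learning algorithms, the random forest and artificial neural network models predicted better. Conclusion The prognostic models for invasive breast cancer built with machine learning algorithms predict well and can assist physicians in judging the prognosis and treatment effect of patients with invasive breast cancer.
Key words:  SEER database  invasive breast cancer  machine learning  prognosis  prediction model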
DOI:10.16781/j.CN31-2187/R.20230255
Received: 2023-05-07  Revised: 2023-10-08
Construction of a prognostic model for invasive breast cancer using machine learning algorithms: based on the SEER database
LU Chunwei1, MA Jun*2
(1. Department of Integrative Medicine, Zhongshan Hospital, Fudan University, Shanghai 200032, China;
2. Department of Integrative Chinese and Western Medicine, Xiamen Branch of Zhongshan Hospital, Fudan University, Xiamen 361000, Fujian, China
*Corresponding author)
Abstract:
Objective To use machine learning algorithms to analyze the factors that influence the prognosis of invasive breast cancer and to construct a prognostic model. Methods Clinical and pathological data of 24 584 patients with invasive breast cancer from 2010 to 2015 were collected from the Surveillance, Epidemiology, and End Results (SEER) database. Univariate analysis and logistic regression analysis were used to screen the prognostic variables. Five machine learning classification algorithms (logistic regression, decision tree, support vector machine, random forest, and artificial neural network) were used to build prediction models of survival prognosis. The predictive ability of each modeling method was evaluated using sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). Results Among the 21 model input variables, histological grade, T stage, N stage, M stage, brain metastasis, human epidermal growth factor receptor 2 expression status, and surgical treatment had a substantial influence on the survival prognosis of patients with invasive breast cancer. Of the prognostic models built with the 5 machine learning algorithms, the random forest and artificial neural network models predicted better than the others. Conclusion The prognostic models for invasive breast cancer built with machine learning algorithms predict well and can assist physicians in judging the prognosis and treatment effect of patients with invasive breast cancer.
Key words:  SEER database  invasive breast cancer  machine learning  prognosis  prediction model
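
The four evaluation indexes named in the abstract (sensitivity, specificity, accuracy, and AUC) can be computed from a model's predicted probabilities as in the minimal Python sketch below. This is an illustration only, not the authors' code: the function names, the 0.5 decision threshold, the coding of the positive class as 1, and the toy labels and scores are all assumptions introduced here.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Return sensitivity, specificity, accuracy, and AUC.

    The threshold of 0.5 is an assumed default, not a value from the paper.
    """
    auc = roc_auc(y_true, y_score)  # also validates that both classes occur
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "auc": auc,
    }


# Toy data (hypothetical): 1 = event of interest, 0 = no event.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
metrics = evaluate(labels, scores)
print(metrics)
```

Running `evaluate` on the held-out predictions of each of the five classifiers would give one row of metrics per model, which is one way to compare them side by side. AUC is computed from the raw scores, so unlike the other three indexes it does not depend on the chosen threshold.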