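Abstract:
Objective To construct, based on the Shapley value method from game theory, an approach for estimating the relative importance of predictors in a linear regression model when multicollinearity exists among them, and to introduce the concept of the sequential importance partial R² value. Methods The Shapley value method from game theory, proposed by Shapley in 1953, was applied with four routine blood indices from 757 normal subjects, white blood cell count (WBC), red blood cell count (RBC), platelet count (PLT), and hematocrit (HCT), as predictors, to assess their relative importance for hemoglobin (HB) and to examine the practical meaning of the sequential importance partial R² value. The estimates were then compared with traditional indices and with currently recommended indices. Results The predictors retained in the final regression model were RBC, PLT, and HCT, with estimated relative importance for HB of 0.355 3, 0.012 4, and 0.553 8, respectively. The relative importance estimated by the Shapley value method agreed with that from dominance analysis, currently the most recommended method. The sequential importance partial R² value of a predictor differed according to the order in which the predictors entered the regression model. Conclusion HCT has the greatest influence on HB, followed by RBC, while the influence of PLT is small. This ranking is consistent with the ranking by correlation, indicating that estimating the relative importance of predictors with the Shapley value method is reasonable.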
Key words: linear model; relative importance; sequential importance partial R² value; game theory
DOI: 10.3724/SP.J.1008.2014.00865
Received: 2013-12-31; Revised: 2014-02-21
Funding: National Natural Science Foundation of China (81172771).
|
Analysis and application of game theory in estimating variable importance in linear model |
JIA Xiao-xia1,2, WU Li-zhi3, YANG Wen2, SHEN Qi-jun1,2*
(1. Department of Basic Medical Sciences, Zhejiang Pharmaceutical College, Ningbo 315000, Zhejiang, China; 2. Department of Preventive Medicine, Ningbo University School of Medicine, Ningbo 315000, Zhejiang, China; 3. Department of Environmental and Occupational Health, Zhejiang Center for Disease Control and Prevention, Hangzhou 310051, Zhejiang, China) |
Abstract: |
Objective To apply Shapley value analysis from game theory to evaluate the relative importance of predictors in linear regression when collinearity exists, and to introduce a new concept, the sequential importance partial R². Methods Shapley value analysis from game theory (proposed by Shapley in 1953) was used to evaluate the factors influencing hemoglobin (HB) in 757 normal adults by regressing HB on four predictors: white blood cell count (WBC), red blood cell count (RBC), platelet count (PLT), and hematocrit (HCT). The practical significance of the sequential importance partial R² was also analyzed. Finally, the Shapley value estimates were compared with other measures, including traditional methods and the currently recommended method. Results A succinct set of predictors comprising RBC, PLT, and HCT was identified for the multiple regression model, with relative importance values of 0.355 3, 0.012 4, and 0.553 8, respectively. The relative importance estimates from the Shapley value method were consistent with those from dominance analysis. Moreover, the sequential importance partial R² of a predictor differed with the order in which the predictors entered the model. Conclusion HCT makes the largest contribution to HB, followed by RBC, and PLT has the smallest effect on HB. The order of contributions is consistent with the correlation matrix, indicating that the relative importance estimated by the Shapley value method is reasonable.
Key words: linear regression; relative importance; sequential importance partial R²; game theory
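
The Shapley decomposition of R² named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`r_squared`, `shapley_r2`, `sequential_increments`) and the synthetic data are assumptions made for illustration, and the blood-count data from the study are not reproduced here. The sketch treats each predictor as a "player" and the R² of the regression on a subset of predictors as that subset's payoff. A predictor's Shapley value is then its R² gain averaged over all orders of entry. The last function returns the R² gain of each predictor for one chosen order of entry, which is one plausible reading of the sequential importance partial R²; the abstract does not give its formal definition.

```python
from itertools import combinations
from math import factorial

import numpy as np


def r_squared(X, y, cols):
    """R^2 of an ordinary least squares fit of y on the chosen columns of X (with intercept)."""
    if not cols:
        return 0.0  # intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def shapley_r2(X, y):
    """Shapley value of each predictor, with subset R^2 as the payoff function."""
    p = X.shape[1]
    # Cache R^2 for every subset of predictors (2^p fits; fine for small p).
    r2 = {s: r_squared(X, y, s)
          for k in range(p + 1)
          for s in combinations(range(p), k)}
    values = np.zeros(p)
    for j in range(p):
        others = [i for i in range(p) if i != j]
        for k in range(p):
            # Probability that exactly this size-k subset precedes j in a random order.
            weight = factorial(k) * factorial(p - k - 1) / factorial(p)
            for s in combinations(others, k):
                with_j = tuple(sorted(s + (j,)))
                values[j] += weight * (r2[with_j] - r2[s])
    return values


def sequential_increments(X, y, order):
    """R^2 gain of each predictor when predictors enter in the given order."""
    gains, entered, prev = [], [], 0.0
    for j in order:
        entered.append(j)
        cur = r_squared(X, y, tuple(entered))
        gains.append(cur - prev)
        prev = cur
    return gains


# Synthetic example (hypothetical data): x2 is collinear with x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 1.0 * x2 + 0.1 * x3 + rng.normal(size=n)

print("Shapley values:", shapley_r2(X, y))
print("Full-model R^2:", r_squared(X, y, (0, 1, 2)))
print("Gains, order x1,x2,x3:", sequential_increments(X, y, [0, 1, 2]))
print("Gains, order x2,x1,x3:", sequential_increments(X, y, [1, 0, 2]))
```

In this sketch the Shapley values sum to the full-model R², whereas the per-order gains of collinear predictors change with the order of entry. Enumerating every subset requires 2^p regressions, so the approach as written is only practical for a handful of predictors.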