Abstract: | Objective: To develop machine learning models that predict the risk of readmission within 30 days after stroke, and to rank and explain the importance of risk factors based on the best-performing model. Methods: This secondary data analysis used 12,933 records of patients hospitalized for stroke between January 1, 2012 and December 31, 2021, drawn from the clinical database of the three hospitals affiliated with Taipei Medical University. Six machine learning algorithms (logistic regression, support vector machine, decision tree, random forest, gradient boosting machine, and artificial neural network) were used to develop models predicting readmission within 30 days of discharge after stroke. The best model was selected by the area under the receiver operating characteristic curve (AUROC) on the test set, and its feature importance ranking was used to identify risk factors for 30-day readmission. The aim is to help clinicians identify stroke patients at high risk of readmission early, so that a preparation plan tailored to each patient's needs can be drawn up before discharge.
Results: Among the six models compared on the test set, logistic regression performed best, with an AUROC of 0.6247, accuracy of 0.6322, sensitivity of 0.5377, and positive predictive value of 0.1415. Variable importance, assessed from the correlation coefficients, showed that chronic obstructive pulmonary disease (COPD) was the most important risk factor for readmission after stroke, followed by urinary tract infection and pneumonia; platelet inhibitors and a history of stroke had less influence. Conclusion: Using logistic regression to analyze risk factors for readmission within 30 days among stroke patients may help improve the precision and effectiveness of their management. For stroke patients with COPD, a tailored post-acute care (PAC) plan can be drawn up early to help them cope with the anxiety, depression, or other psychological stress that accompanies the disease, which has practical value for patient education on health outcomes. |
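For readers less familiar with the evaluation metrics named in the abstract (AUROC, accuracy, sensitivity, and positive predictive value), the following is a minimal sketch of how each can be computed from true labels and predicted risk scores. It is not the study's code: the function names, the 0.5 threshold, and the ten toy patients are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = readmitted within 30 days."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant for
    small illustrative inputs only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.7, 0.1, 0.4, 0.5, 0.05]

# Assumed decision threshold of 0.5 for turning risk scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all cases classified correctly
sensitivity = tp / (tp + fn)                # share of readmitted patients that were flagged
ppv = tp / (tp + fp)                        # share of flagged patients that were readmitted
area = auroc(y_true, scores)                # threshold-free ranking quality

print(f"accuracy={accuracy:.4f} sensitivity={sensitivity:.4f} ppv={ppv:.4f} auroc={area:.4f}")
```

Accuracy, sensitivity, and positive predictive value all depend on the chosen threshold, whereas AUROC summarizes ranking quality across every threshold, which is one reason it is a common criterion for choosing among candidate models.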