Abstract: | Objective: This study applies six machine learning methods (logistic regression, support vector machines, decision trees, random forests, gradient boosting, and artificial neural networks) to predict high-risk factors for breast cancer recurrence. The aim is to give physicians and patients more complete reference information and thereby help reduce the chance of recurrence.
Methods: In this retrospective cohort study based on medical records, we collected data on 7,006 breast cancer patients from the clinical research database of Taipei Medical University, covering 2000 to 2021. The samples were randomly divided into groups, and six machine learning algorithms were used to build a "disease prediction model" from demographic variables, clinical characteristics, and treatment modalities. Accuracy, sensitivity, specificity, precision, F1-score, and the ROC curve were used as performance metrics to select the best-performing algorithm for predicting potential risk factors for breast cancer recurrence.
Results: Of the 7,006 breast cancer patients, 458 experienced a recurrence within five years, a recurrence rate of 6.54%. In external validation, the gradient boosting model performed best among the six models, with an accuracy of 0.77, sensitivity of 0.79, specificity of 0.77, precision of 0.22, F1-score of 0.34, and AUC of 0.82, indicating good predictive capability. In the best model's ranking of relative variable importance, primary cancer stage was the most important risk factor for breast cancer recurrence, followed by lymph node status, with blood type tied for third. Healthcare providers can use the models' predictions as a reference when making decisions with patients about prognosis, treatment, and care. This may improve patient outcomes and quality of life and reduce the psychological burden on patients, their families, and caregivers. |
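
The abstract names its evaluation metrics without defining them, so the following is a minimal sketch of how they relate, using only the Python standard library. It is not the study's code. The names `ConfusionCounts`, `classification_metrics`, and `f1_from` are hypothetical, and the confusion-matrix counts in `example` are invented for illustration. Only the values 0.22, 0.79, 458, and 7,006 come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; 'positive' means recurrence."""
    tp: int  # recurrences correctly flagged
    fp: int  # non-recurrences incorrectly flagged
    tn: int  # non-recurrences correctly cleared
    fn: int  # recurrences missed


def f1_from(precision: float, sensitivity: float) -> float:
    """F1-score: harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the threshold-based metrics listed in the Methods."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_from(precision, sensitivity),
    }


# Values taken from the abstract: reported precision and sensitivity,
# and the recurrence count over the cohort size.
reported_f1 = f1_from(0.22, 0.79)
recurrence_rate = 458 / 7006

# Hypothetical counts (NOT from the study): 1,000 patients, 65 recurrences.
example = ConfusionCounts(tp=51, fp=215, tn=720, fn=14)
metrics = classification_metrics(example)

if __name__ == "__main__":
    print(f"F1 from reported precision/sensitivity: {reported_f1:.2f}")
    print(f"Recurrence rate: {recurrence_rate:.2%}")
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The hypothetical counts show why precision can stay low while sensitivity and specificity are both near 0.8. When recurrence is rare, even a moderate false-positive rate among the many non-recurrent patients produces more false alarms than true detections.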