Abstract: | Nephrotic syndrome (NS) is a collective term for a group of disorders that impair kidney function; its most common manifestations are loss of protein in the urine and low serum albumin. NS can currently be diagnosed through physical examination, routine urinalysis, and routine blood tests, but these often fail to identify the underlying cause. NS has many causes, and different pathological changes call for different treatments, so an invasive kidney biopsy is still needed for confirmation. A more timely, non-invasive method of assessment would therefore be of considerable value for clinical diagnosis and treatment.
This study used the databases of four hospitals: Taipei Medical University Hospital, Taipei Municipal Wan Fang Hospital, Shuang Ho Hospital (Ministry of Health and Welfare), and Mackay Memorial Hospital. The data covered patients diagnosed with nephrotic syndrome between January 2011 and June 2021. Their blood and urine test reports were combined with codes from the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10), to predict the kidney biopsy subtype. Missing values were imputed with a K-NN imputer, and the data were split into an 80% training set and a 20% test set. SMOTE was applied to the training set to handle class imbalance, and five models (Random Forest, XGBoost, Logistic Regression, SVM, and K-NN) were trained with 5-fold cross-validation. Accuracy and AUC were used as the evaluation metrics.
For ICD-10 code N04.0 (nephrotic syndrome with minor glomerular abnormality), Random Forest achieved an accuracy of 83.0% and an AUC of 0.697. For N04.1 (nephrotic syndrome with focal and segmental glomerular lesions), Random Forest achieved an accuracy of 88.4% and an AUC of 0.886. For N04.2 (nephrotic syndrome with diffuse membranous glomerulonephritis), Random Forest achieved an accuracy of 86.8% and an AUC of 0.833. Overall, Random Forest performed well in every classification subgroup and can offer physicians and patients an additional diagnostic aid. |
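The workflow summarized in the abstract (K-NN imputation, an 80/20 split, SMOTE on the training set, 5-fold cross-validation, and evaluation by accuracy and AUC) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's code: the data are synthetic, only Random Forest is shown, the hyperparameters are arbitrary, and `simple_smote` and `run_sketch` are hypothetical helpers. `simple_smote` is a simplified stand-in for SMOTE (the imbalanced-learn package provides a full `SMOTE` class). The sketch also fits the imputer on the training split only, which may differ from the order used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import NearestNeighbors


def simple_smote(X, y, minority_label=1, k=5, seed=0):
    """Simplified SMOTE stand-in (hypothetical helper): interpolate between
    minority samples and their k nearest minority neighbours until the
    minority class is as large as the rest of the data."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    y_new = np.full(n_new, minority_label, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


def run_sketch(seed=0):
    # Synthetic, imbalanced stand-in for the laboratory features (assumption).
    X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                               weights=[0.85, 0.15], random_state=seed)
    rng = np.random.default_rng(seed)
    X[rng.random(X.shape) < 0.05] = np.nan  # knock out about 5% of the values

    # 80% training set, 20% test set, stratified on the label.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)

    # K-NN imputation, fitted on the training split only (assumption).
    imputer = KNNImputer(n_neighbors=5).fit(X_train)
    X_train, X_test = imputer.transform(X_train), imputer.transform(X_test)

    # Oversample the minority class in the training set only.
    X_bal, y_bal = simple_smote(X_train, y_train, seed=seed)

    # 5-fold cross-validation on the balanced training set.
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    cv_auc = cross_val_score(model, X_bal, y_bal, cv=cv, scoring="roc_auc")

    # Final fit, then accuracy and AUC on the untouched test set.
    model.fit(X_bal, y_bal)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "cv_auc_mean": float(cv_auc.mean()),
        "test_accuracy": float(accuracy_score(y_test, model.predict(X_test))),
        "test_auc": float(roc_auc_score(y_test, proba)),
        "balanced_counts": np.bincount(y_bal).tolist(),
    }


if __name__ == "__main__":
    print(run_sketch())
```

Oversampling before cross-validation, as in this sketch, lets synthetic points derived from one fold appear in another, so the cross-validated AUC tends to be optimistic; the held-out test set is the more trustworthy figure. Reporting AUC alongside accuracy matters with imbalanced classes, because a model can reach high accuracy mostly by predicting the majority class.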