Abstract: | Background: Down syndrome is the most common chromosomal abnormality in newborns and often causes developmental delay and congenital structural anomalies. First-trimester Down syndrome screening combines maternal characteristics, fetal ultrasound findings, and maternal serum markers to estimate the fetal risk of Down syndrome. Few studies have compared different machine learning models with the original first-trimester screening prediction model. If machine learning models can be trained to perform comparably to the original model, the overall screening workflow could be optimized.
Methods: First-trimester Down syndrome screening data from Taipei Chang Gung Memorial Hospital were used. In addition to the original dataset, several data-balancing algorithms were applied to address class imbalance. Machine learning models were then trained with five-fold cross-validation, using the risk values from the original first-trimester Down syndrome prediction model as the ground truth. To improve model efficiency, feature selection was performed on the original dataset, yielding five key features: maternal age, the maternal age-related background risk of Down syndrome, fetal nuchal translucency thickness, serum PAPP-A level, and free beta HCG level. Models trained on these five features were compared with models trained on all 17 features, and their speed was compared with that of the original prediction model. Finally, to assess whether the models behaved logically, each model was allowed to choose its own five most important features through automated feature selection.
Results: Of 4061 cases, 3812 remained for analysis after preprocessing: 165 high-risk and 3647 low-risk cases.
Among the machine learning models trained with five-fold cross-validation, the best recall for high-risk cases was 0.84, achieved by LightGBM on the random under-sampling dataset. The best F1 score, 0.60, was achieved by LSTM on the SVM-SMOTE dataset, and the highest AUC, 0.939, was achieved by both ANN and LSTM on the random over-sampling dataset. Among the models trained on the five selected features, the best recall for high-risk cases was 0.84 (SVM, random under-sampling dataset), the best F1 score was 0.60 (ANN, SVM-SMOTE dataset), and the highest AUC was 0.929 (ANN, random over-sampling dataset). When the models chose their own five most important features, 92.5% of them selected the same five: maternal age, maternal age-related background risk of Down syndrome, fetal nuchal translucency thickness, serum PAPP-A level, and free beta HCG level. Entering a single case took an average of 10.4 seconds with the five-feature machine learning model, compared with 109.1 seconds with the original first-trimester Down syndrome prediction model.
Conclusion: This study confirms that machine learning models can predict Down syndrome risk with performance similar to that of the original first-trimester Down syndrome prediction model while running approximately ten times faster, which could substantially help optimize clinical workflows.
Keywords: machine learning, first-trimester Down syndrome screening, deep learning, feature selection |
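
The evaluation protocol described in the Methods (class balancing, five-fold cross-validation, and scoring on the high-risk class) can be sketched as follows. This is a minimal, standard-library-only illustration and not the study's code. The data are synthetic: only the class counts (165 high-risk, 3647 low-risk) come from the abstract. The single feature `xs` and the threshold rule in `fit_threshold` are hypothetical stand-ins for the real screening features and for models such as LightGBM, SVM, ANN, or LSTM. Applying random under-sampling to the training folds only is also an assumption, since the abstract does not say at which stage balancing was applied.

```python
import random


def stratified_folds(labels, k, rng):
    """Split indices into k folds, keeping the class ratio similar in every fold."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal indices round-robin across folds
    return folds


def random_under_sample(indices, labels, rng):
    """Randomly drop majority-class indices until both classes have equal size."""
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return minority + rng.sample(majority, len(minority))


def fit_threshold(xs, ys):
    """Hypothetical stand-in model: midpoint between the two class means.

    Assumes the high-risk class (label 1) has the larger mean feature value.
    """
    mean_pos = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    mean_neg = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    return (mean_pos + mean_neg) / 2


def recall_f1(y_true, y_pred):
    """Recall and F1 score for the positive (high-risk) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1


def cross_validate(xs, ys, k=5, seed=0):
    """Five-fold CV; under-sampling touches the training folds only (assumption)."""
    rng = random.Random(seed)
    folds = stratified_folds(ys, k, rng)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        balanced = random_under_sample(train_idx, ys, rng)
        threshold = fit_threshold([xs[j] for j in balanced], [ys[j] for j in balanced])
        preds = [1 if xs[j] > threshold else 0 for j in test_idx]
        scores.append(recall_f1([ys[j] for j in test_idx], preds))
    return scores


# Synthetic data: class counts match the abstract, feature values are invented.
data_rng = random.Random(42)
ys = [1] * 165 + [0] * 3647
xs = [data_rng.gauss(2.0, 1.0) if y == 1 else data_rng.gauss(0.0, 1.0) for y in ys]

scores = cross_validate(xs, ys)
for fold, (recall, f1) in enumerate(scores, start=1):
    print(f"fold {fold}: recall={recall:.2f}  F1={f1:.2f}")
```

Keeping the test fold at its original class ratio means recall and F1 are measured on data that still looks like the screened population, where high-risk cases are about 4% of the total (165 of 3812). Balancing the test fold as well would inflate precision and therefore F1.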