    Please use this permanent URL to cite or link to this item: http://libir.tmu.edu.tw/handle/987654321/62457


    Title: 利用自然語言處理及機器學習早期識別兒童生長障礙
    Early identification and diagnosis of growth disorders using NLP and machine learning
    Author: 程春燕
    CHENG, CHUN-YEN
    Contributors: College of Medicine, Professional Master Program in Artificial Intelligence in Medicine
    許明暉
    黎阮國慶
    Keywords: Growth Disorder;Short Stature;Puberty;Growth Curve;Pediatrics;Artificial Intelligence;Electronic Medical Record;Machine Learning;Random Forest;Feature selection;Imbalanced Data;Text Mining
    Date: 2022-06-23
    Upload time: 2023-01-17 14:52:43 (UTC+8)
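    Abstract: Objectives: Abnormal growth is a critical clinical condition that pediatricians take seriously, and the main reason for studying childhood growth disorders is to identify conditions that may threaten a child's future health. Pathological short stature occurs in about 5% of children and should be recognized, diagnosed, and treated appropriately in good time, so monitoring for growth disorders is essential in pediatric health care. Artificial intelligence is widely used in medical imaging and diagnosis to support precision medicine, and this study aims to use machine learning to help primary care physicians diagnose childhood growth disorders early and accurately.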
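    Methods: In this retrospective pilot study, a clinical study application was filed through the Taipei Medical University Clinical Research Database, and clinical growth data of pediatric outpatients from that database were analyzed, 112267 records in total (a training and test set of 85743 records from Taipei Medical University Hospital and an external validation set of 26514 records from Wan Fang Medical Center). Python and natural language processing were applied to the electronic medical records for text mining and data preprocessing. Machine learning algorithms were then used to assess growth disorders, comparing several classifiers (decision tree, k-nearest neighbors, random forest, logistic regression, support vector machine, multilayer perceptron, adaptive boosting, gradient boosting machine, and extreme gradient boosting) to predict growth disorders in children followed for one year after their first visit. To obtain the best predictive model, feature selection and imbalance-handling methods were both used to find the optimal feature set and to balance the results. In addition, growth indicators tracked on an electronic growth chart were added to improve the accuracy of growth disorder diagnosis: height and weight percentiles, a gap from mid-parental height of ≧1 SDS and ≧2 SDS, a gap between bone age and chronological age of ≧1 SDS and ≧2 SDS, and a growth rate of ≦5 cm/year.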
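The abstract does not say which imbalance-handling method was used. As one common option, and purely as an illustrative assumption, the sketch below shows random oversampling of the minority class using only the Python standard library. The function name `random_oversample` and the toy data are hypothetical and are not taken from the thesis.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class.

    Assumption: this is a generic technique, not necessarily the
    imbalance method used in the thesis.
    """
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # All original rows that carry this label.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy example: 8 negative rows and 2 positive rows (made-up values).
toy_rows = [[i] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels, seed=42)
print(Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that the test and external validation sets keep their original class proportions.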
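    Results: In analyses using either the first-12-outpatient-records module or the hybrid feature selection module, random forest, gradient boosting machine, and extreme gradient boosting performed comparably and stably on both the training and test set and the external validation set. Random forest, which ran faster than the other algorithms in the hybrid feature selection module, reached the following validation performance for classifying short stature or precocious puberty: accuracy 0.88, sensitivity 0.91, specificity 0.86, F1-score 0.88, and AUC 0.89. Classification on the growth indicators (bone age gap ≧2 SDS, target height gap ≧2 SDS, or growth rate ≦5 cm/year) performed better still: accuracy 0.90, sensitivity 0.92, specificity 0.87, F1-score 0.91, and AUC 0.89.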
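    Discussion: The machine learning algorithms used in this study showed stable and excellent performance in classifying childhood growth disorders. Of these algorithms, random forest is a fast and convenient choice for precision medical diagnosis. In addition, text mining of medication treatment records and disease diagnosis information agreed with the hospital's structured ICD10 diagnosis codes in 47.15% of cases and with the structured medication data in 86.03%, and it recovered an additional 11.23% of medication information that improves the completeness of the hospital's structured medication field, which is offered for the reference of future researchers.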
    Objectives: The purpose of this study was to use machine learning to assist primary care physicians in the early and accurate diagnosis of childhood growth disorders.
    Methods: In this retrospective study, we obtained clinical growth data of outpatients from the Taipei Medical University Clinical Research Database (TMUCRD). A total of 112267 records were selected for analysis. Text mining and data preprocessing were applied to extract and clean the raw data. We then implemented several machine learning algorithms to predict growth disorders in outpatients after one year of follow-up. To find the optimal model, we assessed the performance of different models (Support Vector Machine, Multilayer Perceptron, k-Nearest Neighbors, Decision Tree, Logistic Regression, Random Forest, Adaptive Boosting, Gradient Boosting Machine, and Extreme Gradient Boosting) using several evaluation metrics. Feature selection and imbalance-handling approaches were employed to find the optimal feature set and to balance the results. In addition, growth indicators tracked on an electronic growth chart were incorporated to improve the diagnosis of growth disorders: a target height gap of ≧1 SDS and ≧2 SDS, a gap between skeletal age and chronological age of ≧1 SDS and ≧2 SDS, height and weight percentiles, and a growth rate of ≦5 cm/year.
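The growth indicators listed above can be expressed as simple threshold checks. The following is a minimal sketch, assuming that an SDS is computed as (value - reference mean) / reference SD and that growth rate is height gain divided by elapsed years. The names `Visit`, `sds`, `growth_velocity_cm_per_year`, and `growth_flags` are hypothetical, the example values are made up, and treating the SDS gaps as absolute values is an assumption, since the abstract does not state the direction of the gap.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Visit:
    visit_date: date
    height_cm: float


def sds(value, ref_mean, ref_sd):
    """Standard deviation score: distance from a reference mean in SD units."""
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    return (value - ref_mean) / ref_sd


def growth_velocity_cm_per_year(first, last):
    """Height gain per year between two visits."""
    years = (last.visit_date - first.visit_date).days / 365.25
    if years <= 0:
        raise ValueError("visits must be in chronological order")
    return (last.height_cm - first.height_cm) / years


def growth_flags(first, last, bone_age_gap_sds, target_height_gap_sds):
    """Apply the thresholds named in the abstract.

    Assumption: gaps are compared by absolute value, because the
    abstract does not say whether the gap is signed.
    """
    velocity = growth_velocity_cm_per_year(first, last)
    return {
        "growth_rate_le_5cm_per_year": velocity <= 5.0,
        "bone_age_gap_ge_2sds": abs(bone_age_gap_sds) >= 2.0,
        "target_height_gap_ge_2sds": abs(target_height_gap_sds) >= 2.0,
    }


# Made-up example: 4 cm gained over one year of follow-up.
first_visit = Visit(date(2021, 1, 1), 120.0)
last_visit = Visit(date(2022, 1, 1), 124.0)
print(growth_flags(first_visit, last_visit,
                   bone_age_gap_sds=2.3, target_height_gap_sds=-1.2))
```

In practice the reference mean and SD for an SDS depend on age and sex and come from a population growth reference, which this sketch leaves to the caller.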
    Results: In analyses using either the first-12-records module or the hybrid feature selection module, Random Forest, Gradient Boosting Machine, and Extreme Gradient Boosting performed comparably and stably on both the training and test set and the external validation set. Among them, Random Forest was faster than the others in the hybrid feature selection module, and its validation performance for diagnosing short stature or precocious puberty reached an accuracy of 0.88, sensitivity of 0.91, specificity of 0.86, F1-score of 0.88, and AUC of 0.89. Classification on the growth indicators (bone age gap ≧2 SDS, target height gap ≧2 SDS, or growth rate ≦5 cm/year) performed better still, with an accuracy of 0.90, sensitivity of 0.92, specificity of 0.87, F1-score of 0.91, and AUC of 0.89.
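For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and F1-score follow from a binary confusion matrix. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration, so they do not reproduce the thesis results. AUC is omitted because it requires predicted scores rather than a single confusion matrix.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts, not taken from the thesis.
example = classification_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```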
    Conclusion: In this study, different machine learning algorithms achieved stable and excellent performance in the classification and diagnosis of children's growth disorders. Among these algorithms, Random Forest was a fast, convenient, and accurate algorithm for precision medical diagnosis. In addition, text mining of medication records and disease diagnosis information agreed with the structured ICD10 diagnosis column of the EMR in 47.15% of cases and with the medication column in 86.03%, and it extracted an additional 11.23% of medication information on average, which improves the completeness of the original medication column and is provided for future researchers' reference.
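The abstract does not define how consistency between text-mined and structured fields was measured. One plausible reading, shown here as a hedged sketch, treats each field as a set of normalized entries: agreement is the share of structured entries also found by text mining, and the extra share is the proportion of mined entries absent from the structured field. The function names and the drug names below are illustrative assumptions only.

```python
def normalize(entries):
    """Lowercase and strip entries so trivial formatting differences do not count."""
    return {e.strip().lower() for e in entries if e and e.strip()}


def agreement_rate(structured, mined):
    """Share of structured entries that text mining also found (assumed definition)."""
    s, m = normalize(structured), normalize(mined)
    return len(s & m) / len(s) if s else 0.0


def extra_rate(structured, mined):
    """Share of mined entries that are missing from the structured field (assumed definition)."""
    s, m = normalize(structured), normalize(mined)
    return len(m - s) / len(m) if m else 0.0


# Made-up medication lists for one hypothetical patient record.
structured_meds = ["Somatropin", "Leuprorelin", "Levothyroxine", "Vitamin D"]
mined_meds = ["somatropin ", "leuprorelin", "vitamin d", "triptorelin"]

print(f"agreement: {agreement_rate(structured_meds, mined_meds):.2%}")
print(f"extra:     {extra_rate(structured_meds, mined_meds):.2%}")
```

Other definitions, such as record-level exact match or Jaccard overlap, would give different percentages, so the figures in the abstract should be read against the definition used in the full thesis.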
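    Description: Master's degree
    Advisor: 許明暉
    Co-advisor: 黎阮國慶
    Committee member: 張詠淳
    Committee member: 陳中明
    Committee member: 侯家瑋
    Committee member: 許明暉
    Committee member: 黎阮國慶
    Type: thesis
    Appears in collections: [Professional Master Program in Artificial Intelligence in Medicine] Theses and Dissertations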


