Please use this permanent URL to cite or link to this document:
http://libir.tmu.edu.tw/handle/987654321/62894
|
Title: | Using Latent Representations to Predict the Toxicity of Chemicals and its Investigation |
Author: | LIN, HUI-LUN (林暉倫) |
Contributors: | Master's Program, Graduate Institute of Big Data Technology and Management; 許明暉; 童俊維 |
Keywords: | Computer-aided drug design; Variational Autoencoder; Generative models; Drug development |
Date: | 2023-07-10 |
Upload time: | 2023-09-21 14:27:37 (UTC+8) |
Abstract: | Computer-aided drug design is regarded as one way to accelerate drug development and improve the efficiency of the traditional process. However, chemical space is vast, and finding compounds with the desired physicochemical properties is difficult. Over the past decade, deep learning has been applied successfully in many research areas, including image recognition, speech recognition, natural language processing, and drug development. More recently, deep-learning-based generative models have offered more effective approaches to molecular design. A variational autoencoder (VAE) makes it feasible to optimize molecular properties in latent space, because the latent space is continuous and differentiable; random sampling in the latent space can also generate compounds with the desired physicochemical properties. Earlier VAEs, however, considered only the structural information of compounds during training and ignored their biological activity, so the generated compounds lacked important bioactivity. We therefore propose a generative model for molecular design that combines a VAE with a linear classifier: a linear classifier added to the latent space is trained on bioactivity-related indicators, while the VAE is trained on the structural information of compounds.
As a proof of concept, the latent space of the trained model is shown to be predictive of the biological activity of compounds. We then test whether the model can serve as a feature extractor that improves performance on downstream classification tasks. Even with only a few thousand training examples, the model achieves competitive results. With further training on a larger set of compounds, we expect that it could become a feature-extraction tool for compounds. |
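Description: | Master's thesis. Advisor: 許明暉. Co-advisor: 童俊維. Committee members: 許明暉, 童俊維, 張詠淳, 蘇家玉, 陳錦華 |
Type: | thesis |
Appears in collections: | [Graduate Institute of Big Data Technology and Management] Master's and doctoral theses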
|
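The abstract describes training a VAE on compound structure while a linear classifier on the latent vector is trained on bioactivity labels. The record does not give the actual objective, so the following is only a minimal sketch of one plausible per-example joint loss: a reconstruction term, a KL term, and a binary cross-entropy term for the classifier. Every function name, the weights `beta` and `gamma`, and the choice to classify from the latent mean are assumptions made for illustration, not details taken from the thesis.

```python
import math

# Illustrative sketch only: all names and the loss weighting are hypothetical.

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def linear_classifier_prob(z, weights, bias):
    """Sigmoid of a linear score on the latent vector z (numerically stable)."""
    score = sum(w * zi for w, zi in zip(weights, z)) + bias
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)

def binary_cross_entropy(p, label, eps=1e-12):
    """Cross-entropy between predicted probability p and a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def reconstruction_nll(token_probs, target_tokens, eps=1e-12):
    """Negative log-likelihood of the target token sequence, given one
    decoder probability distribution per sequence position."""
    return -sum(math.log(max(probs[t], eps))
                for probs, t in zip(token_probs, target_tokens))

def joint_loss(token_probs, target_tokens, mu, log_var,
               weights, bias, label, beta=1.0, gamma=1.0):
    """Reconstruction + beta * KL + gamma * classifier loss (assumed form).
    The classifier is applied to the latent mean mu as a simplification."""
    recon = reconstruction_nll(token_probs, target_tokens)
    kl = kl_to_standard_normal(mu, log_var)
    clf = binary_cross_entropy(linear_classifier_prob(mu, weights, bias), label)
    return recon + beta * kl + gamma * clf
```

In this form the classifier term pushes active and inactive compounds toward opposite sides of a hyperplane in latent space, which is one way a latent vector could become useful as a feature for downstream classification.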
Files in this item:
File | Description | Size | Format | Views |
index.html | | 0Kb | HTML | 52 |
|
All items in TMUIR are protected by the original copyright.
|