Abstract: | Lung cancer remains the leading cause of cancer-related deaths worldwide. Despite notable advancements in understanding, diagnosing, and treating lung cancer over the past two decades, challenges persist, with the five-year survival rate for lung cancer patients hovering around 20%, significantly lower than for many other types of cancer. To improve patient outcomes, further research into early detection, diagnosis, and targeted treatments is essential. Additionally, in the era of precision medicine, developing computational methods to tailor individual treatments represents a crucial frontier.
This study focuses on identifying new biomarkers and developing drug response prediction (DRP) models for lung cancer, which could aid in the personalized selection of therapies. It comprises two main parts:
Part 1: Using bioinformatics approaches, we aimed to discover potential biomarkers for lung cancer and proposed two novel candidates: ALDH2, a prospective stem-cell-related biomarker, and ABCG1, a promising epigenetic marker. Our findings indicate significant downregulation of both ALDH2 and ABCG1 at the gene and protein expression levels in lung cancer tissues. Moreover, we observed that the gene expression level of ALDH2 is closely correlated with reduced overall survival time in early-stage cancer patients, highlighting the prognostic value of ALDH2. ALDH2 activation was also found to be associated with stem-cell-related pathways, suggesting its role as a stem-cell-related biomarker in lung cancer. For ABCG1, our results show the importance of DNA methylation changes in lung cancer, with alterations in DNA methylation status strongly linked to patient survival time, indicating its potential as an epigenetic marker.
Part 2: With the goal of constructing DRP models, we gathered drug sensitivity data for lung cancer cell lines and developed two models: a machine learning (ML) classification model to predict drug sensitivity and a deep learning (DL) regression model to predict the IC50 value of drug-cell line interactions. To develop the ML model, we extracted drug SMILES features and combined them with biological features to improve the model's predictive capacity. For algorithm selection, we tested seven common ML classifiers and found that Random Forest (RF) was the most effective for lung cancer data. Our optimized model, named RF-Lung-DR, achieved accuracies of 78% on the lung adenocarcinoma (LUAD) dataset and 80% on the squamous cell carcinoma (LUSC) dataset. In addition, our results indicated that drug SMILES features contribute significantly to the model’s performance. Leveraging this finding, we focused on these features to construct our DL model. Building on the success of graph neural network models in addressing DRP problems in cancer research, we developed MLG2Net, a Molecular Local Global Graph Network based on graph neural networks. MLG2Net was constructed with two branches: one representing the drug SMILES through a local global graph network, and the other representing biological markers via a map. MLG2Net performed well, with a Pearson correlation coefficient (CCp) of 0.8616 on the LUAD dataset and 0.7999 on the LUSC dataset. Compared to reference graph models, MLG2Net showed superior performance, highlighting the potential of DL in identifying effective therapeutics for lung cancer patients through high-throughput screening of cell line-drug interactions using available drug sensitivity data.
In conclusion, advancing personalized cancer treatment requires the discovery of novel biomarkers and the development of effective DRP models to identify efficacious therapeutics. Our research proposes ALDH2 and ABCG1 as potential biomarkers in lung cancer, showing prognostic value and potential relevance to targeted therapy, supported by their downregulation in lung cancer tissues and their correlation with patients’ overall survival time. Additionally, we have developed the RF-Lung-DR and MLG2Net models to predict drug responses in lung cancer cell lines, potentially facilitating therapeutic screening. This work contributes to the future of lung cancer management. |
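The abstract reports regression performance as a Pearson correlation coefficient (CCp) between predicted and measured IC50 values. As a minimal sketch of how such a metric can be computed, the following pure-Python example may help; the function name `pearson_correlation` and the toy IC50 values are hypothetical illustrations and are not taken from the study's data or code.

```python
import math


def pearson_correlation(predicted, measured):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two paired values are required")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical toy values (assumption: not from the study's datasets).
measured_ic50 = [1.2, 0.4, 2.5, 3.1, 0.9]
predicted_ic50 = [1.0, 0.7, 2.2, 3.4, 1.1]

ccp = pearson_correlation(predicted_ic50, measured_ic50)
print(f"CCp = {ccp:.4f}")
```

A value close to 1 indicates that predictions rise and fall with the measured values. Because the coefficient captures linear association only, it is usually read alongside an error metric.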