Abstract: | Background: Pathology reports have traditionally been organized manually. This approach is labor-intensive, costly, and error-prone, which complicates data analysis for medical research and delays critical insights. Artificial intelligence (AI) offers potential solutions, but many systems struggle with the complexity of pathology texts. Numerous studies and articles have underscored the urgent need to automate the extraction and structuring of information from free-text pathology reports.
Objective: This study explores the use of generative AI to automate the extraction and structuring of information from 33 free-text breast cancer pathology reports obtained from Taipei Medical University Hospital. The aim is to accelerate information extraction and enhance the reliability of the extracted information for research and diagnostic purposes. The study also examines whether generative AI can improve the accuracy and efficiency of structuring information from free-text pathology reports.
Methods: A Streamlit web application was developed and integrated with the ChatGPT large language model provided by OpenAI, which extracts and structures information from free-text pathology reports. The extracted data are compiled into a downloadable Excel file for subsequent analysis. The application also displays the results in its web interface, so users can promptly verify the accuracy of the processed information. This verification step is intended to support data integrity and operational efficiency, both of which are essential for reliable research outcomes.
Results: The Streamlit web application, integrated with the ChatGPT large language model, automated the extraction and structuring of information from the 33 free-text breast cancer pathology reports and achieved an extraction and structuring accuracy of 99.61%. This result supports the effectiveness of generative AI in handling free-text pathology reports.
Conclusion: The system significantly reduced the time required to structure and analyze free-text pathology reports compared with traditional manual methods. This advancement highlights the potential of generative AI to transform the processing of free-text pathology reports, improving both efficiency and accuracy and offering substantial benefits for research and clinical diagnostics. |
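The extraction step described in the Methods, and a field-level accuracy measure of the kind reported in the Results, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the field names in `FIELDS`, the prompt wording, and the helper names are all hypothetical, and the abstract does not say how accuracy was computed. The `llm` argument stands in for a call to the ChatGPT model, so the sketch runs without network access.

```python
import json
from typing import Callable, Dict, List

# Hypothetical target fields; the abstract does not list the fields it extracts.
FIELDS = ["histologic_type", "tumor_size_cm", "er_status"]


def build_prompt(report_text: str) -> str:
    """Ask the model for a JSON object containing exactly the target fields."""
    return (
        "Extract the following fields from the pathology report and reply "
        "with a single JSON object. Use null for anything not stated.\n"
        f"Fields: {', '.join(FIELDS)}\n\nReport:\n{report_text}"
    )


def parse_response(raw: str) -> Dict[str, object]:
    """Parse the model reply; unknown keys are dropped, missing ones become None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {field: data.get(field) for field in FIELDS}


def structure_reports(
    reports: List[str], llm: Callable[[str], str]
) -> List[Dict[str, object]]:
    """Turn each free-text report into one structured row.

    `llm` is any function mapping a prompt to the model's text reply
    (assumed here; in a real app it would wrap the model API call).
    """
    return [parse_response(llm(build_prompt(text))) for text in reports]


def field_accuracy(
    predicted: List[Dict[str, object]], reference: List[Dict[str, object]]
) -> float:
    """Fraction of (report, field) cells that match a manually curated reference."""
    total = len(reference) * len(FIELDS)
    correct = sum(
        pred[field] == ref[field]
        for pred, ref in zip(predicted, reference)
        for field in FIELDS
    )
    return correct / total if total else 0.0
```

In an application like the one described, the resulting rows could be shown in the web interface for validation and written to a spreadsheet, for example with pandas `DataFrame.to_excel`; that wiring is left out here to keep the sketch dependency-free.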