Jorge Baquerizo-Burgos · Maria Egas-Izquierdo · Doménica Cunto · Carlos Robles-Medranda
Endoscopy Division, Instituto Ecuatoriano de Enfermedades Digestivas (IECED), Guayaquil, Ecuador.
Acta Gastroenterol Latinoam 2023;53(3):226-240
Received: 09/08/2023 / Accepted: 14/09/2023 / Published online 30/09/2023
https://doi.org/10.52787/agl.v53i3.339
Artificial intelligence is a field of science and engineering that focuses on the computational understanding of intelligent behaviors and the creation of artifacts that exhibit such behaviors, enabling computers to function and think like humans. This technology assists in overcoming the multiple challenges faced by healthcare professionals and contributes to the diagnosis, management, and prognosis of patients. Currently, several artificial intelligence models have been developed for digestive endoscopy, including models that allow the detection of anatomical structures that can assist in the training of physicians, serve as a guide during endoscopic procedures, or assist in stratifying premalignant and malignant lesions. This would reduce false negatives and provide more timely treatments. Computerized systems for lesion detection and diagnosis exist for different segments of the digestive tract, each with specific functions that provide assistance during procedures. All of this has been aimed at reducing risks stemming from human and environmental factors, among others, which can affect the diagnosis and management of diseases. Artificial intelligence models for digestive endoscopy can not only enhance the visual impression of endoscopists but also reduce the learning curve through the application of precise technologies. In this way, the gap between experienced and less experienced endoscopists is reduced. In this article, the technological advancements of artificial intelligence in digestive endoscopy and related future aspects are discussed.
Keywords. Artificial intelligence, computer-assisted detection, computer-assisted diagnosis, deep learning, endoscopy.
Abbreviations
AI: Artificial Intelligence.
ML: Machine learning.
DL: Deep learning.
CADe: Computer-assisted detection device.
CADx: Computer-assisted diagnostic device.
EUS: Endoscopic ultrasound.
EGD: Esophagogastroduodenoscopy.
AUC: Area under the curve.
BE: Barrett’s Esophagus.
SCEC: Squamous cell esophageal carcinoma.
NBI: Narrow Band Imaging.
GERD: Gastroesophageal reflux disease.
SVM: Support Vector Machine.
CNN: Convolutional Neural Network.
CRC: Colorectal cancer.
ADR: Adenoma Detection Rate.
RR: Relative risk.
APC: Adenoma per Colonoscopy.
PDR: Polyp Detection Rate.
ERCP: Endoscopic retrograde cholangiopancreatography.
PPV: Positive Predictive Value.
NPV: Negative Predictive Value.
mAP: Mean Average Precision.
FPS: Frames per Second.
IoU: Intersection over the union.
CT: Computerized tomography.
MRI: Magnetic resonance imaging.
IPMN: Intraductal papillary mucinous neoplasm.
SEL: Subepithelial Lesion.
GIST: Gastrointestinal Stromal Tumor.
NET: Neuroendocrine Tumor.
Artificial Intelligence: Basic Concepts
Artificial intelligence (AI) is a branch of computer science whose purpose is the understanding and execution of intelligent insights from a set of computational models.1 Using a set of algorithms, AI is capable of functioning and reasoning like a human being through a learning process based on training and has the advantage of being able to complete it in less time than a human being.1 Additionally, this technology can incorporate machine learning (ML) and its subset, deep learning (DL).2
ML is a subgroup of AI characterized by the use of mathematical models for learning from data, which later enables pattern recognition.3 Predictive models are created from algorithms, allowing for data analysis and the resolution of complex problems. Additionally, ML can be categorized into three types: supervised, unsupervised, and reinforcement2,4,5:
a) Supervised learning: this type of learning is based on training using well-categorized or labeled data (external supervision). Labeled data is divided for both training and internal validation. Supervised learning is based on regression, classification, and characterization.2,3
b) Unsupervised learning: this model learns from uncategorized data, enabling the algorithm to operate without any guidance, relying on the understanding of patterns and thus requiring a greater amount of information.2,3
c) Reinforcement learning: it does not require labeled data or external supervision to learn; instead, it is based on learning from the environment through rewards.2,3
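As a minimal sketch of the supervised case described in (a), the following example divides a small labeled data set into a training part and an internal-validation part, and then fits a very simple classifier. All feature values, labels, and function names are hypothetical, and the nearest-centroid classifier is used only for brevity; it is not the method of any study cited in this review.

```python
def split(samples, validation_every=3):
    """Hold out every n-th labeled sample for internal validation."""
    train = [s for i, s in enumerate(samples) if (i + 1) % validation_every != 0]
    valid = [s for i, s in enumerate(samples) if (i + 1) % validation_every == 0]
    return train, valid


def fit_centroids(train):
    """Training step: average the feature vectors of each label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


# Hypothetical labeled data: (feature vector, label)
LABELED = [
    ((0.10, 0.20), "normal"), ((0.90, 0.80), "lesion"), ((0.20, 0.10), "normal"),
    ((0.80, 0.90), "lesion"), ((0.15, 0.25), "normal"), ((0.85, 0.75), "lesion"),
]

train, valid = split(LABELED)
centroids = fit_centroids(train)
# Internal validation: fraction of held-out samples classified correctly
accuracy = sum(predict(centroids, f) == y for f, y in valid) / len(valid)
```

The held-out samples play no part in training, which is what allows the validation accuracy to serve as an estimate of performance on unseen data.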
DL is a specialized category of ML that is based on the architecture of neural networks resembling those in the human brain.3 It consists of an initial layer that receives input; this layer is followed by a set of hidden middle layers, and then the final output layer (Figure 1). Each layer in this network comprises a group of neurons or nodes that transform (activate) an input into an output through mathematical functions.2 The output of a previous layer serves as the input for the next layer, and so on, until reaching the output layer to obtain a final outcome or detection.2,3
Figure 1. Schematic representation of the architecture of convolutional neural network models
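The layer-by-layer flow described above can be illustrated with a short sketch. It assumes fully connected layers with a sigmoid activation, so the convolutional layers shown in Figure 1 are omitted, and the weights are hypothetical, untrained values chosen only to make the example run.

```python
import math


def sigmoid(z):
    """Activation function that maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """One fully connected layer: each node computes sigmoid(w . x + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one, up to the output layer."""
    activation = inputs
    for weights, biases in layers:
        activation = dense_layer(activation, weights, biases)
    return activation


# Hypothetical weights: 3 inputs -> 2 hidden nodes -> 1 output node
LAYERS = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
    ([[1.2, -0.7]], [0.05]),
]

output = forward([0.9, 0.1, 0.4], LAYERS)  # single value between 0 and 1
```

Training consists of adjusting the weights and biases so that the output matches the labels; that step is not shown here.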
The development of a detection model based on DL involves three main phases. In the first phase, data (images or videos) is collected, and the structures to be used in training the model are properly labeled. Next, in the second phase, the model’s architecture is established, and neural networks are created (input layer, middle layer, and output layer). Finally, in the third phase, the samples obtained in the previous phases are used to train the model and subsequently validate it internally (Figure 2).1 The metrics for evaluating the model’s performance are obtained from this final phase (Table 1).
Figure 2. Phases of deep learning model development
Table 1. Metrics obtained for the performance evaluation of deep learning models
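Several of the metrics quoted throughout this review (sensitivity, specificity, PPV, NPV, accuracy, and F1-score) derive from the four cells of a confusion matrix. The sketch below uses the standard textbook definitions; the function name and the counts in the example are hypothetical and are not taken from Table 1 or from any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positives and negatives.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    sensitivity = tp / (tp + fn)           # proportion of true lesions detected
    specificity = tn / (tn + fp)           # proportion of normal cases correctly cleared
    ppv = tp / (tp + fp)                   # positive predictive value (precision)
    npv = tn / (tn + fn)                   # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical validation set: 100 frames with lesions, 100 without
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

PPV and NPV depend on how common the lesion is in the validation set, which is one reason internal validation figures may not carry over to clinical use.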
Clinical Applications of Artificial Intelligence
Clinical applications of AI have progressively increased in the field of healthcare, including gastroenterology. AI helps overcome the numerous challenges faced by healthcare professionals during data acquisition, analysis, and knowledge application that contributes to patient diagnosis, management, and prognosis.1 Additionally, automation in image identification and recognition assists in reducing errors stemming from human factors (fatigue and workload, among others).
Technological advancements have led to the development of intelligent systems that facilitate the detection or the stratification of lesions observed during endoscopic or imaging procedures. These are referred to as computer-assisted detection devices (CADe) or computer-assisted diagnostic devices (CADx).6 Thus, the application of these devices in digestive endoscopy can facilitate and increase lesion detection during procedures and categorize lesions as benign or malignant in real-time.
Artificial Intelligence in Digestive Endoscopy
Currently, several models have been developed for upper and lower gastrointestinal endoscopy, as well as for advanced endoscopic procedures such as cholangioscopy and endoscopic ultrasound (EUS) (Table 2). These models encompass both CADe and CADx systems, each employing distinct algorithms that enable different functionalities. These functionalities include identifying anatomical structures and specific lesions, assisting in physician training or serving as a guide during endoscopic procedures, and reducing the number of false negatives through characterization and stratification of premalignant and malignant lesions, among others.4
Table 2. Applications of artificial intelligence in different segments of the digestive system
In the following sections, we will review updated information on the uses of AI and its impact according to the type of endoscopic assessment.
Upper Digestive Endoscopy
Also known as esophagogastroduodenoscopy (EGD), this procedure is of great importance in the diagnosis of upper gastrointestinal tract lesions.7,8 However, the diagnostic rate varies according to the performance of the endoscopist.9 Errors during EGD are one of the leading causes of incorrect diagnosis of premalignant lesions and severe esophagogastroduodenal diseases. AI systems have been developed to overcome these technical challenges. Their use in the upper digestive tract ranges from anatomical localization to the detection and evaluation of malignant and premalignant lesions.7-10
Takiyama et al. developed an AI model capable of classifying anatomical structures in the upper digestive tract, which has shown excellent performance in identifying the larynx (AUC 1.00), the esophagus (AUC 1.00), the stomach (upper, middle, and lower portions), and the duodenum (AUC 0.99).7
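The AUC values quoted here can be read as the probability that a model assigns a higher score to a randomly chosen positive image than to a randomly chosen negative one. The sketch below computes the AUC directly from that definition; the function name and the scores are hypothetical and do not reproduce data from the cited study.

```python
def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct ranking. Both lists must be non-empty.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical model scores for images that do and do not show the structure
perfect = auc([0.9, 0.8], [0.1, 0.2])   # every positive outranks every negative
partial = auc([0.9, 0.4], [0.5, 0.1])   # one of four pairs is ranked wrongly
```

An AUC of 1.00 therefore means that the model separated the two groups completely, while 0.50 corresponds to chance-level ranking.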
On the other hand, in the multicenter study conducted by Luo et al., the diagnostic accuracy of the GRAIDS model for detecting upper digestive tract neoplasms (esophagus and stomach) was evaluated. This model achieved a diagnostic accuracy of 95.5% (95% CI: 95.2 - 95.7) during internal validation. When comparing its performance with endoscopists of different expertise levels, it showed similar sensitivity to experts (94.2% vs. 94.5%), and higher sensitivity compared to competent endoscopists (94.2% vs. 85.8%) and trainees (94.2% vs. 72.2%).11 Additionally, the diagnostic accuracy of experts (92.8%) when using the AI model was similar to that of the group of competent endoscopists (93.4%) and trainees (90.4%).11 This demonstrates that the application of AI can narrow the gap between experts and non-experts.11
Esophagus
Accuracy in the early diagnosis of Barrett’s esophagus (BE) and esophageal neoplasia remains a challenge, even for many experienced endoscopists. Once BE is identified, the identification of regions with dysplasia or early adenocarcinoma becomes necessary.
AI models have been designed to assist endoscopists in improving the accuracy of diagnosing these lesions,12 including systems for neoplasia classification using real-time magnification with high precision (89.9%), which have enabled early diagnosis and differentiation of neoplasia in BE.13,14
The CADx system developed and validated by de Groof et al. classified BE images as neoplastic or non-neoplastic, and its performance was compared with that of 53 endoscopists. The model outperformed the endoscopists, achieving higher accuracy (88.0% vs. 73.0%), sensitivity (93.0% vs. 72.0%), and specificity (83.0% vs. 74.0%).15
On the other hand, due to the significant importance of recognizing and treating esophageal carcinoma promptly, researchers have developed systems that enable lesion detection as well as the assessment of disease invasion.12,16,17 Esophageal carcinoma is often detected at advanced stages, and small lesions are usually detected by highly experienced endoscopists.12 AI allows for the detection of lesions smaller than 10 mm with high accuracy (91.4%), even surpassing that of many so-called expert endoscopists (> 15 years of experience, 88.8%), those with moderate experience (5 - 15 years, 81.6%), and those with limited experience (< 5 years, 77.2%).16
Determining the depth of the lesion enables the selection of the appropriate treatment (surgical, endoscopic, pharmacological) and the assessment of prognosis.17 One of the models with high diagnostic accuracy in predicting the depth of invasion of squamous cell esophageal carcinoma (SCEC) was proposed by Tokai et al. The researchers used 1751 images for training and 291 images for validation, achieving a sensitivity of 84.1% and a diagnostic accuracy of 80.9% in estimating the depth of SCEC invasion. When compared to thirteen endoscopists, this model showed higher diagnostic accuracy and a greater AUC.17
Stomach
Stomach cancer usually shows nonspecific symptoms during its early stages, which is why patients are often diagnosed at advanced stages. The prognosis of stomach cancer depends on the assessment of the depth of the lesion and its early detection. It has been reported that the early detection of stomach cancer can increase 5-year survival rates to 90.0%.18
According to Menon et al., the rate of false negatives in the diagnosis of early stomach cancer can reach up to 25.0%.18 Automation systems aim to reduce this percentage with models that classify stomach images in EGD to monitor blind spots with high precision,19 models that detect lesions suggestive of stomach cancer11 and precancerous lesions,20 and that further assess the depth of invasion.21
Chromoendoscopy is one of the diagnostic methods used for the early detection of gastric neoplasm. However, during an endoscopic session, multiple video frames can be generated, making the review process an exhaustive task for endoscopists.
To prevent frames from being overlooked during evaluation, Ali et al. developed a CADx system trained to classify frames as normal or pathological based on local and global texture. This model showed a sensitivity, specificity, precision, and AUC of 91.0%, 82.0%, 87.0%, and 0.91, respectively.22 The model proved to be a diagnostic aid in the early detection of gastric cancer, since it reduces the time needed to evaluate endoscopic sequences.
The model studied by Wu et al. showed high accuracy, specificity, and sensitivity (92.5%, 94.0%, and 91.0%, respectively) in assessing non-malignancy, surpassing expert endoscopists in this task.19 During real-time procedures, it also showed excellent performance in early gastric adenocarcinoma detection with blind-spot monitoring. Other AI models have shown high accuracy in diagnosing gastrointestinal neoplasia, comparable to that of expert endoscopists.11,17
Furthermore, models have been developed that not only identify neoplastic lesions but also predict their depth. Nagao et al. trained a model to predict the depth of invasion of gastric cancer using conventional white-light images, narrow band imaging (NBI) images, and images with indigo carmine contrast.21 The model showed high accuracy across all three systems (white-light images 94.5%, NBI 94.3%, and indigo carmine 95.5%).21
Zhu et al. (total accuracy 89.2%) and Yoon et al. (sensitivity 81.7%, specificity 75.4%) have reported diagnostic accuracy of their models for assessing invasion depth, which is comparable to that of other conventional methods. The advantages of using these models lie in the more objective assessment of macroscopic lesion characteristics, reducing the need for other invasive techniques like EUS.21,23,24
In addition to neoplasia identification, other applications include the detection of gastroesophageal reflux disease (GERD)25 and H. pylori-associated gastritis.26 Models developed to assist in classifying GERD with NBI have achieved a total diagnostic accuracy of 99.2% for grade A-B lesions, 100% for grade C-D lesions, and 100% in the control group. Therefore, these models are considered highly useful for assisting in the automatic detection of lesions consistent with GERD and for increasing the diagnostic accuracy of trainees.25
On the other hand, the neural network designed for predicting H. pylori in endoscopic images correctly diagnosed 80% of negative cases, 84% of eradicated cases, and 48% of positive cases. The authors of this study emphasize the utility of this model in identifying patients who may require additional confirmation testing for H. pylori based on endoscopic results, and recommend its use as a diagnostic aid.26
Small Intestine
Capsule endoscopy, a non-invasive procedure, allows for the detection and classification of lesions (bleeding, ulcers, and polyps), assessment of intestinal motility, and evaluation of conditions like celiac disease and other pathologies that primarily affect the small intestine. However, the large number of images to be evaluated (> 60 000) and the difficulty in directing the capsule (completely dependent on gastrointestinal peristalsis) make the procedure slow (from 45 minutes to 8 hours) and tedious. To address the technical difficulties associated with this procedure, automation using AI has been explored.28
AI models for capsule endoscopy have been developed based on DL. Image classification and categorization are performed using support vector machines (SVMs). These models separate data using hyperplanes in two or more dimensions. After kernel parameters are applied, an “optimal” hyperplane is determined, which creates “boundaries” for data categorization (Figure 3).27,28 Following categorization, DL algorithms are used to create artificial neural networks.27,29
Figure 3. Support vector machines (SVM) for data classification
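To make the role of the kernel and the hyperplane more concrete, the sketch below shows how an already trained SVM could assign a new feature vector to one side of the boundary. It assumes a Gaussian (RBF) kernel, and the support vectors, coefficients, bias, and class names are hypothetical, hand-picked values; the training step that would normally determine them is not shown.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel: similarity between two feature vectors."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Signed score relative to the separating hyperplane.

    dual_coefs[i] is alpha_i * y_i for the i-th support vector.
    """
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias


# Hypothetical trained model with one support vector per class
SUPPORT_VECTORS = [(0.0, 0.0), (2.0, 2.0)]
DUAL_COEFS = [-1.0, 1.0]
BIAS = 0.0


def classify(x):
    """Positive side of the boundary -> 'abnormal', negative side -> 'normal'."""
    score = svm_decision(x, SUPPORT_VECTORS, DUAL_COEFS, BIAS)
    return "abnormal" if score >= 0 else "normal"
```

The parameter `gamma` controls how quickly similarity decays with distance, and is an example of the kernel parameters mentioned above.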
Currently, AI models for capsule endoscopy include capsule tracking, detection of polyps, bleeding, ulcers, and the study of small intestine-specific pathologies like celiac disease and Crohn’s disease.26
In the gastrointestinal tract, models incorporated in capsule endoscopy usually allow for its tracking and localization in different segments (mouth, stomach, small intestine, and colon), after excluding frames with “noise” (feces, bubbles, etc.), with a sensitivity and specificity of 88.0%.30 Evaluating images based on their topographic location saves study time and enhances diagnostic accuracy.30
In the multicenter study published by Ding et al.,30 a convolutional neural network (CNN) was developed for the identification of normal images, inflammation, ulcers, polyps, lymphangiectasia, bleeding, vascular disease, diverticula, and parasites, among others. When comparing the model’s results with those of participating gastroenterologists, a sensitivity of 99.9% (95% CI: 99.7 - 99.9) was obtained vs. 74.6% (95% CI: 73.1 - 76.0) in the identification of abnormalities per patient, and a sensitivity of 99.9% (95% CI: 99.6 - 99.9) vs. 76.9% (95% CI: 75.6 - 78.2) in lesion analysis. Moreover, the reading time per patient was significantly shorter in the CNN group compared to gastroenterologists (5.9 ± 2.23 minutes vs. 96.6 ± 22.53 minutes, p < 0.001). Based on these results, the researchers concluded that the application of AI in capsule endoscopy is an important tool to assist gastroenterologists in analyzing images captured with this device more efficiently and accurately.31
Models for real-time bleeding detection achieve diagnostic accuracy of up to 99.0%.32-34 The model developed by Aoki et al. detected gastrointestinal bleeding with a sensitivity, specificity, and accuracy of 96.6%, 99.9%, and 99.9%, respectively.31 Other models developed aid in the stratification and prediction of the risk of recurrent bleeding in order to provide timely treatment and avoid unnecessary endoscopies.35,36
For the detection of ulcers and erosions, Wang et al.’s model achieved a diagnostic accuracy of 92.1%.37 For the identification of tumors, the diagnostic accuracy starts at 86.0%, with a sensitivity ranging from 88.0% to 97.0%, and specificity between 81.0% and 96.0%.38
In a recent meta-analysis of CNN in capsule endoscopy, pooled sensitivity and specificity were obtained as follows: 96.0% (95% CI: 91.0 - 98.0) and 97.0% (95% CI: 93.0 - 99.0) in the detection of ulcers and erosion; 97.0% (95% CI: 93.0 - 99.0) and 100% (95% CI: 99.0 - 100) in the identification of gastrointestinal bleeding; and 97.0% (95% CI: 82.0 - 99.0) and 98.0% (95% CI: 92.0 - 99.0) in the detection of cancer and polyps.37
Furthermore, models developed for the identification of inflammatory bowel disease using capsule endoscopy currently achieve high levels of accuracy (83.3% to 90.8%),39 and allow the recognition of hidden disease patterns as well.4
Compared to current endoscopes, which provide high-quality images, the image quality of capsule endoscopy is poor.27 However, AI models for capsule endoscopy have the advantage of robust databases fed with a large number of images useful for the creation of CNNs.
Colon
Colorectal cancer (CRC) is currently considered one of the leading causes of cancer-related death in both men and women.40 For the early identification of premalignant lesions (polyps and adenomas), colonoscopy remains an essential procedure. However, according to the literature, approximately 25% of these lesions can go undetected, even in expert hands.41 Undetected premalignant lesions increase the risk of developing CRC.
Automated systems have been developed for the detection and characterization of polyps. The first systems developed used a limited number of images and consequently had a poor diagnostic accuracy (72.0%).42 Subsequently, new models have been trained with higher accuracy (> 95.0%) using a larger number of images. This makes it possible to better evaluate polyps and tiny adenomas (≤ 5 mm), and to predict CRC prognosis, patient survival, and distant invasion.42,43
A meta-analysis evaluated the performance of CADe systems in the detection of colorectal neoplasia.44 The authors found a higher adenoma detection rate (ADR) in the groups that used CADe compared to their control groups (36.6% vs. 25.2%; relative risk (RR): 1.44; 95% CI: 1.27 - 1.62; p < 0.001). Additionally, adenomas per colonoscopy (APC) were higher in the CADe groups than in the control groups (50.3% vs. 34.6%; RR: 1.70; 95% CI: 1.53 - 1.89; p < 0.001). The authors did not find a significant difference in colonoscopy efficiency (withdrawal time) between the group that used CADe and the control group.44
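For readers less familiar with the relative risk, the sketch below shows how an RR and its 95% confidence interval can be computed for a single two-group comparison, using the usual log-scale approximation. The counts are hypothetical and were chosen only to mirror the ADR percentages above; a meta-analysis weights each included study, so the crude ratio obtained here (about 1.45) is not expected to match the pooled estimate exactly.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A vs. group B with a log-scale confidence interval.

    z=1.96 corresponds to a 95% interval. Event counts must be non-zero.
    """
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent proportions
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 366/1000 patients with adenomas (CADe) vs. 252/1000 (control)
rr, lower, upper = relative_risk(366, 1000, 252, 1000)
```

An interval that excludes 1.0, as in this example, indicates that the difference between the groups is unlikely to be due to chance alone.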
Robles-Medranda et al. studied the efficiency of AI-assisted colonoscopy for the detection of polyps and adenomas during CRC screening studies (Figure 4).45 The results obtained by the authors were compared according to the level of experience of the endoscopists (experts vs. non-experts). With AI assistance during endoscopic procedures, an increase in the ADR and polyp detection rate (PDR) was achieved from 16.5% to 18.2% and from 50.4% to 60.0%, respectively. Depending on the experience level, the increase in ADR was evident in the junior group (10.8% to 16.2%), which approached the level of the experts.45
Figure 4. Detection of polypoid lesion in the colon using a computer-aided detection device (CADe) (AITROL, mdconsgroup, Guayaquil, Ecuador)
Furthermore, CADx systems have been designed with the ability to immediately characterize polyps using imaging technologies beyond white-light endoscopy and magnified NBI, such as confocal endomicroscopy. In this field, a model developed by Sánchez-Montes et al. for predicting the histological classification of polyps achieved a diagnostic accuracy, sensitivity, and specificity of 91.1%, 92.3%, and 89.2%, respectively.46
Additionally, features such as lesion depression, fold convergence, and irregular and heterogeneous capillary patterns are associated with deep invasion of premalignant lesions. Current CADe models that identify and detect these characteristics are attractive tools for determining the type of treatment to be performed (e.g., endoscopic mucosal resection).4
Despite the availability of diagnostic methods such as cholangioscopy, endoscopic retrograde cholangiopancreatography (ERCP), or EUS for studying the biliopancreatic system, there are difficulties in differentiating lesions and discrepancies among evaluators.
Cholangioscopy
As cholangioscopy is a relatively new advanced endoscopic technique without established training guidelines, visual impressions vary widely among operators.47 As a result, several classifications have been created to detect malignancy based on the macroscopic characteristics of biliary lesions during the procedure,48-51 with the intention of reducing this variability among observers. However, these classifications have not achieved that goal.47
Recently, AI models have been developed to assist operators in detecting malignant lesions and obtaining biopsies. The first AI models in cholangioscopy were developed for the detection of tortuous blood vessels, but had the disadvantage of being limited to still images and were not usable in live cases.52,53 Another limitation of these models is their lack of clinical validation. However, these models achieved quite high internal validation metrics. In their first study, Mascarenhas et al. developed a model using 6475 cholangioscopy images obtained from 85 patients. During validation using frames, they obtained a sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of 99.3%, 99.4%, 99.6%, and 98.7%, respectively.53 Later, they conducted a new study, in which they nearly doubled the number of images obtained from the same number of patients (from 6475 to 11855 images); this time, they assessed the model’s accuracy in distinguishing between benign and malignant lesions. The metrics obtained were diagnostic accuracy (94.8%), sensitivity (94.7%), specificity (92.1%), PPV (94.8%), and NPV (84.2%).52 It is important to note that, even though the model achieved excellent internal validation parameters, these should not be extrapolated to its real-world utility in live clinical cases.
Subsequently, two studies with real-time AI models were conducted. The first one, conducted by Marya et al., evaluated the clinical application of the CNN model and compared it to biopsy and cytology results. The study found that the CNN model achieved higher sensitivity (93.3%), specificity (88.2%), and accuracy (90.6%) compared to biopsies (sensitivity 35.7%, specificity 100%, and accuracy 60.9%, respectively) and cytology (sensitivity 40.0%, specificity 100%, and accuracy 62.5%, respectively).54 One limitation of the study was that only one operator was responsible for annotating around 2 million images, which could lead to errors due to fatigue or bias.
Robles-Medranda et al. developed their own CNN model capable of detecting neoplastic lesions in pre-recorded and real-time videos (Figure 5). Following the AI model implementation stages (data collection, annotation, and model design), internal validation was performed, and then a clinical comparison between the model and endoscopists (experts and non-experts) was done.55 This multicenter study was conducted in two phases. The first phase involved the development and validation of the AI model, named AIWorks-Cholangioscopy (mdconsgroup, Guayaquil, Ecuador). The initial version of this model was developed using 81080 images from 23 patients, and achieved a mean average precision (mAP) of 0.298, an F1-score of 0.280, an intersection over the union (IoU) of 32.3%, and a total loss of 0.1034. Despite these acceptable results, the frames per second (FPS) detection rate was low (average around 5). This low FPS detection prevented the model from being used for real-time lesion detection. Its internal validation obtained a sensitivity, specificity, PPV, and NPV of 98.0%, 95.0%, 98.0%, and 94.0%, respectively. Subsequently, by increasing the number of cases and images available for training (from 81080 to 198941 images), along with improved image quality, the internal validation metrics increased dramatically: mAP from 0.298 to 0.880, F1-score from 0.280 to 0.738, IoU from 32.3% to 83.2%, and total loss decreased from 0.1034 to 0.0975. Sensitivity, specificity, PPV, and NPV for detecting neoplastic lesions in images had similar results to those obtained by Mascarenhas et al. (98.6%, 98.0%, 89.2%, and 99.2%, respectively).52,53,55 However, during the second phase for clinical validation in 170 patients, it was observed that these diagnostic accuracy values decreased and approached those of the endoscopists.
When comparing the AI model to the visual impression of endoscopists (experts and non-experts) using two classifications of neoplastic lesions (CRM and Mendoza classification),48,50 it was evident that the AI model was superior to both experts and non-experts.55 This study highlights the importance of conducting clinical validation and not extrapolating internal validation results to the clinical setting as the final outcome.
Figure 5. Detection of images suggestive of neoplasia in cholangioscopy through artificial intelligence (AIWorks-Cholangioscopy, mdconsgroup, Guayaquil, Ecuador)
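The IoU reported for detection models such as the one above measures how well a predicted bounding box overlaps the box annotated by an expert. The following sketch computes it for two axis-aligned rectangles; the box format and the coordinates are assumptions made for illustration and are not taken from the cited study.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if the boxes do not touch)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical annotated lesion box vs. model prediction, in pixel coordinates
annotated = (0, 0, 2, 2)
predicted = (1, 1, 3, 3)
overlap = iou(annotated, predicted)  # 1 shared unit of area out of 7 covered in total
```

A value of 1.0 means that the two boxes coincide, and 0.0 means that they do not overlap at all.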
Endoscopic Ultrasound
For the diagnostic and therapeutic management of biliopancreatic pathologies, EUS is considered superior to CT scans and MRI due to its higher diagnostic accuracy and ability to obtain higher-quality images.56 However, these procedures have limitations such as low sensitivity to differentiate between benign and malignant intraductal papillary mucinous neoplasia (IPMN) and low specificity to differentiate between malignant lesions and chronic pancreatitis.57 Another limitation of this procedure is its operator-dependence, so less experienced endoscopists may not appreciate the differences between chronic pancreatitis and pancreatic malignancy.57 For this reason, the application of AI in such procedures would be beneficial and could influence the quality of EUS performed by expert endoscopists or trainees.58
Several studies have been conducted to evaluate and compare the diagnostic accuracy of AI-assisted EUS versus traditional EUS for detecting pancreatic cancer, and to distinguish between chronic lesions and normal tissue.55 A study by Norton et al. demonstrated that their AI model had higher sensitivity (100%) for the differentiation between malignancy and inflammation; however, the diagnostic accuracy was similar between the model (80%), the endoscopist blinded to procedure results (83.0%) and the traditional procedure (85.0%).59 Their study shows the potential of applying AI models for image interpretation in EUS and the ability to differentiate between malignancy and chronic conditions, thereby addressing one of the limitations mentioned above.
Over time, new EUS techniques have been developed and included as part of patient management. At the same time, different types of AI were designed to differentiate between benign and malignant pancreatic lesions. Săftoiu et al. evaluated the application of an AI model that distinguishes between benign and malignant lesions using elastography, a technique that assesses tissue stiffness and elasticity. Based on these parameters, high sensitivity (91.4%), specificity (87.9%), and diagnostic accuracy (89.7%) were achieved.60 These results indicate the potential for applying AI models in elastography in cases where EUS-guided fine-needle aspiration yields negative results.
Subsequently, AI-assisted elastography was compared with traditional elastography. The diagnostic accuracy of the AI-assisted procedure (EUS + AI) was higher (AUC: 94.0%) than that of the traditional procedure (AUC: 85.0%). This suggests that CNN-based elastography models can provide decision support by offering quicker and more accurate image interpretation than elastography without AI.61
Another limitation of EUS that has been evaluated is the differentiation between benign and malignant IPMNs. IPMNs are precursors to pancreatic adenocarcinomas, and once these lesions progress to invasive stages, patient prognosis worsens.62
Kuwahara et al. developed a DL model and investigated whether AI-based preoperative analysis of EUS images of IPMNs can predict malignancy. They compared the model's lesion interpretation with preoperative diagnoses by endoscopists, conventional predictive techniques, and other EUS techniques.63 The AI model achieved an AUC of 91.0% for predicting malignancy. In the comparison of diagnostic accuracy between the model and the endoscopists, the model was superior (94.0% vs. 56.0%, respectively).63
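AUC and diagnostic accuracy measure different things: accuracy depends on a chosen decision threshold, whereas AUC can be read as the probability that a randomly chosen malignant case receives a higher model score than a randomly chosen benign one. A minimal sketch of that pairwise (Mann-Whitney) formulation, assuming hypothetical model scores that are not data from any cited study:

```python
from typing import Sequence


def auc_pairwise(malignant_scores: Sequence[float],
                 benign_scores: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    if not malignant_scores or not benign_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical malignancy scores produced by a model (illustration only).
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
print(f"AUC = {auc_pairwise(malignant, benign):.3f}")
```

The quadratic loop is chosen for readability; for large datasets a rank-based computation gives the same value more efficiently.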
Another application of AI in EUS is the differential diagnosis of subepithelial lesions (SELs). Hirai et al. evaluated a DL model that classifies SELs in EUS images. They collected images of SELs in the upper digestive tract, including gastrointestinal stromal tumors (GISTs), leiomyomas, schwannomas, neuroendocrine tumors (NETs), and ectopic pancreas. The model classified these lesions with an accuracy of 86.1%, which was much higher than that of the participating endoscopists.64 The sensitivity, specificity, and diagnostic accuracy for distinguishing GISTs from other lesions were 98.8%, 67.6%, and 89.3%, respectively.64 Other studies, including meta-analyses, that assessed the accuracy of CNN models in differentiating GISTs from other lesions reported similar results.65
Given the increasing incidence of biliopancreatic neoplasms worldwide, it is important to accurately differentiate malignant tumor lesions from benign or normal tissue. The application of AI in biliopancreatic endoscopy has been under evaluation for a long time, with promising results. The application of AI in medicine is considered beneficial because it will help overcome the limitations of these complex procedures (cholangioscopy and EUS).
Future Applications of Artificial Intelligence in Digestive Endoscopy
AI has proven useful for detecting and classifying lesions across the various endoscopic procedures studied to date. However, the application of intelligent models can go beyond detection and diagnosis.
Robles-Medranda et al. developed an EUS system based on CNN models trained to detect normal anatomical structures in the different windows evaluated during this advanced procedure (mediastinal, gastric, and duodenal) (Figure 6). This allowed the identification of 20 anatomical structures with high sensitivity and specificity (Table 3).66 The diagnostic accuracy of the model for detecting these structures was higher than 95.0%. This indicates that AI models are useful not only for detecting pathologies but also for detecting normal structures, which can benefit the training of endoscopists and reinforce the knowledge of more experienced ones. The application of AI in endoscopist training was evaluated by Zhang et al., who demonstrated that those trained with AI had a shorter learning curve and better results than those trained traditionally.67
Figure 6. Artificial intelligence model for endoscopic ultrasound detecting anatomical structures (AIWorks-EUS, mdconsgroup, Guayaquil, Ecuador)
Table 3. Anatomical structures detected by the computer-assisted detection system AIWorks-EUS (mdconsgroup, Guayaquil, Ecuador)
Additionally, intelligent models make it possible to automate image capture, both for report generation and as a measure of examination quality.68 This can be achieved by combining various functions and applying them simultaneously, such as detecting the organ being evaluated along with lesions within that organ. In this way, the study report can be generated automatically at the end of the procedure.68,69 Models such as AI-EARS and ISRGS have shown good diagnostic accuracy in identifying organs and lesions, automatically generating a report using AI in both the upper68 and lower69 digestive tract.
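One way to picture this combination of organ and lesion detection is as an aggregation of per-frame model outputs into a structured summary. The Python sketch below is purely illustrative: the FrameResult type, the build_report function, and the organ and lesion labels are hypothetical and do not describe how AI-EARS or ISRGS are implemented.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class FrameResult:
    """Hypothetical per-frame output of the detection models."""
    timestamp_s: float
    organ: str                    # label from an organ-detection model
    lesion: Optional[str] = None  # label from a lesion-detection model, if any


def build_report(frames: Iterable[FrameResult],
                 required_organs: List[str]) -> Dict[str, object]:
    """Aggregate frame-level detections into a simple study summary."""
    findings: Dict[str, set] = {}
    for frame in frames:
        lesions = findings.setdefault(frame.organ, set())
        if frame.lesion is not None:
            lesions.add(frame.lesion)
    # Organs that were never seen serve as a simple completeness check.
    not_visualized = [o for o in required_organs if o not in findings]
    return {
        "findings": {organ: sorted(ls) for organ, ls in findings.items()},
        "not_visualized": not_visualized,
    }


# Hypothetical frames from one procedure (illustration only).
frames = [
    FrameResult(1.0, "esophagus"),
    FrameResult(2.0, "stomach", "polyp"),
    FrameResult(3.0, "stomach", "polyp"),
]
report = build_report(frames, ["esophagus", "stomach", "duodenum"])
print(report)
```

Listing organs that were never visualized is one simple way such a summary could double as a quality measure, in the sense described above.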
The application of AI models to train endoscopists in advanced endoscopic procedures can increase their effectiveness and reduce the number of procedures needed to achieve competence.
Although the goal of automating models during endoscopy is to reduce risks associated with human and environmental factors, among others, the success of AI models will depend on the quality and quantity of the data used for their training and validation. Additionally, external validation through multicenter and international studies with expert endoscopists is essential before training results can be generalized.
AI models in digestive endoscopy have the potential to improve the visual perception of endoscopists and reduce the accuracy gap between those with less experience and those considered experts. Moreover, they can be of great assistance in lesion detection and tissue invasion assessment. A future is envisioned in which traditional training approaches are surpassed and medical centers worldwide implement improvements in professional training by applying precise technologies that shorten the learning curve for these procedures, ultimately improving the competency of less experienced professionals.
Intellectual Property. The authors declare that the data, figures, and tables presented in this manuscript are original and were made in their respective institutions.
Funding. The authors declare that there were no external sources of funding.
Conflict of interest. Carlos Robles-Medranda is a key opinion leader for Pentax Medical, Steris, Micro-tech, G-Tech Medical Supply, CREO Medical, EndoSound, and mdconsgroup. The other authors declare no conflict of interest.
Copyright © 2023 Acta Gastroenterológica Latinoamericana. This is an open-access article released under the terms of the Creative Commons Attribution (CC BY-NC-SA 4.0) license, which allows non-commercial use, distribution, and reproduction, provided the original author and source are acknowledged.
Cite this article as: Baquerizo-Burgos J, Egas-Izquierdo M, Cunto D et al. The Era of Intelligent Endoscopy: How Artificial Intelligence Empowers Digestive Endoscopy. Acta Gastroenterol Latinoam. 2023;53(3):226-240. https://doi.org/10.52787/agl.v53i3.339
Correspondence: Jorge Baquerizo-Burgos
Email: jorgebaquerizoburgos@gmail.com