Abstract

Background

A real-time deep learning system was developed to identify the extrahepatic bile ducts during indocyanine green fluorescence-guided laparoscopic cholecystectomy.

Methods

Two expert surgeons annotated surgical videos from 113 patients, labelling six classes of structures. YOLOv7, a real-time object detection model that balances speed and accuracy in identifying and localizing objects within images, was trained for structure identification. To evaluate the model's performance, single-frame and short video clip validations were used. The primary outcomes were average precision and mean average precision in single-frame validation. Secondary outcomes were accuracy and other metrics in short video clip validation. An intraoperative prototype was developed for the verification experiments.

Results

A total of 3993 images were extracted to train the YOLOv7 model. In single-frame validation, the mean average precision across all classes was 0.846, and the average precision for the common bile duct and cystic duct was 0.864 and 0.698 respectively. The model trained on all six classes of objects exhibited the best overall performance, with an accuracy of 94.39% for the common bile duct and 84.97% for the cystic duct in video clip validation.

Conclusion

This model could potentially assist surgeons in identifying the critical landmarks during laparoscopic cholecystectomy, thereby minimizing the risk of bile duct injuries.

Introduction

Bile duct injury (BDI) remains a serious complication of laparoscopic cholecystectomy (LC), occurring in 0.2–1.5% of cases1,2. Misinterpretation of biliary anatomy contributes to 71–97% of BDIs3,4. Intraoperative techniques for visualizing the extrahepatic bile ducts, such as intraoperative ultrasonography and cholangiography, have been implemented, but their use is often limited to selected cases because of the prolonged surgery duration, the requirement for additional medical resources, and the need for specific surgical expertise5,6.

Fluorescence-guided surgery with indocyanine green (ICG) has emerged as a promising technique for enhancing visualization of biliary structures; visualization rates of the common bile duct (CBD) can reach 66% before dissection of Calot’s triangle and 94% afterwards7. Enhanced contrast and delineation of biliary structures may help surgeons overcome the variability in fluorescence intensity.

The integration of artificial intelligence (AI) into laparoscopic surgery has demonstrated considerable promise, particularly in LC. Several studies have used semantic segmentation to identify ‘go’ and ‘no-go’ zones8,9 and to automatically assess the critical view of safety (CVS) during LC10–12. One study of anatomical landmark detection reported low average precision (AP) scores of 0.320 for the CBD and 0.074 for the cystic duct13. The challenges stem from factors such as background noise, variable image quality, the presence of multiple objects in the field, and intra-abdominal fat and fibrous tissues, all of which complicate accurate demarcation.

The aim of this study was to combine the benefits of fluorescence-guided surgery and deep learning by training YOLOv7 models to accurately identify anatomical landmarks during LC, with a specific focus on extrahepatic bile duct identification. The study also aimed to assess the performance of these models under varying surgical difficulties while developing a prototype landmark indication system for use in the operating room.

Methods

Data set preparation

This study was approved by the Institutional Review Board of Chang Gung Medical Foundation (registration date: 9 June 2022; registration number: 202200847B0). Videos were collected from patients who underwent ICG fluorescence-guided LC between June 2022 and April 2023. Patients who had subtotal cholecystectomy or required conversion to open surgery were excluded. All patients received 2 ml of intravenous ICG 15 min before the operation. Four surgeons, following a standardized surgical protocol, performed the procedures. The Nassar operative difficulty scale14, based on operative findings of the gallbladder, cystic pedicle, and associated adhesions, was recorded for each patient.

From each LC video, frames were extracted at a rate of 1 frame per second, from the moment the gallbladder was grasped before dissection of the hepatocystic triangle until transection of the cystic structures was complete. Two expert surgeons then selected 35–40 frames from each patient. Frames in which Calot’s triangle was completely out of view, in which the image was obscured by factors such as mist, blood, or poor focus, or which were captured during adhesiolysis or gallbladder take-down were excluded to avoid introducing noise. The data set was randomly split at the patient level into training, validation, and testing sets, with the distribution of frames approximately following an 8:1:1 ratio.
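As an illustration of this preprocessing step, the sketch below shows 1-frame-per-second extraction with OpenCV and a patient-level split in roughly 8:1:1 proportions. The file paths, naming scheme, and random seed are hypothetical assumptions for illustration, not the authors' actual pipeline.

```python
# Minimal sketch: extract ~1 frame per second from a video and split patients 8:1:1.
# Paths, file names, and the seed are illustrative assumptions.
import random
from pathlib import Path

import cv2


def extract_frames(video_path: str, out_dir: str) -> None:
    """Save approximately one frame per second of the video as JPEG files."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS metadata is missing
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    frame_idx, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % int(round(fps)) == 0:  # keep roughly 1 frame per second
            cv2.imwrite(f"{out_dir}/frame_{saved:05d}.jpg", frame)
            saved += 1
        frame_idx += 1
    cap.release()


def split_patients(patient_ids: list[str], seed: int = 42):
    """Patient-level random split into roughly 8:1:1 train/validation/test sets."""
    ids = patient_ids[:]
    random.Random(seed).shuffle(ids)
    n_train, n_val = int(0.8 * len(ids)), int(0.1 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```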

Data annotations and augmentation

Six intraoperative structures (liver, gallbladder, CBD, instrument, cystic artery, and cystic duct) were annotated by two experienced surgeons, each with experience of more than 200 LC procedures. A senior surgeon with experience of more than 500 LC procedures validated the annotations. During the YOLOv7 training process, data augmentation techniques, including flipping, cropping, rotation, and adjustments to saturation and brightness, were applied to ensure diverse representation of surgical scenarios.
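In the public YOLOv7 implementation, such augmentations are typically controlled through a hyperparameter YAML file. The sketch below writes the augmentation-related keys from Python; the key names follow the repository's hyp.*.yaml convention, but the values are illustrative assumptions rather than the settings used in this study.

```python
# Illustrative augmentation settings in the style of YOLOv7's hyperparameter YAML.
# In practice these keys are merged into a copy of the repository's full default
# hyperparameter file, which also holds learning-rate and loss settings.
import yaml

augmentation_hyp = {
    "hsv_h": 0.015,    # hue jitter (fraction)
    "hsv_s": 0.7,      # saturation jitter (fraction)
    "hsv_v": 0.4,      # brightness (value) jitter (fraction)
    "degrees": 10.0,   # random rotation (+/- degrees)
    "translate": 0.1,  # random translation (fraction of image size)
    "scale": 0.5,      # random scaling / crop-like zoom gain
    "fliplr": 0.5,     # horizontal flip probability
    "flipud": 0.0,     # vertical flip probability
}

with open("hyp.custom.yaml", "w") as f:
    yaml.safe_dump(augmentation_hyp, f)
```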

Deep learning model training and performance validation

The object detector You Only Look Once version 7 (YOLOv7) was trained on the LC image data set. YOLOv7 was selected over several competing object detectors because of its notable advancements in speed and accuracy, mainly attributable to its trainable bag-of-freebies methods, re-parameterized modules, and dynamic label assignment strategy15. Its architecture facilitates rapid detection of complex geometric relationships between anatomical structures, making it well suited to real-time surgical applications. Three models were trained to determine the most effective CBD identification method (the complete workflow is illustrated in Fig. 1).
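For context, the sketch below shows what a dataset definition and training run might look like with the public YOLOv7 repository (github.com/WongKinYiu/yolov7). The class names mirror the six structures annotated in this study; all paths, the image size, batch size, and epoch count are assumptions for illustration, not the study's actual configuration.

```python
# Hypothetical dataset definition and training invocation for the public YOLOv7 repo.
import subprocess

import yaml

dataset = {
    "train": "data/lc/images/train",
    "val": "data/lc/images/val",
    "test": "data/lc/images/test",
    "nc": 6,  # number of classes
    "names": ["liver", "gallbladder", "CBD", "instrument", "cystic_artery", "cystic_duct"],
}
with open("data/lc.yaml", "w") as f:
    yaml.safe_dump(dataset, f)

# Flags follow the repository's train.py interface; values are illustrative.
subprocess.run(
    [
        "python", "train.py",
        "--weights", "yolov7.pt",      # COCO-pretrained checkpoint
        "--data", "data/lc.yaml",
        "--hyp", "hyp.custom.yaml",    # full hyperparameter file (defaults plus the augmentation keys above)
        "--img-size", "640", "640",
        "--batch-size", "16",
        "--epochs", "300",
        "--name", "lc_6class",
    ],
    check=True,
)
```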

Fig. 1 Workflow diagram illustrating the YOLOv7 network training process and the two performance validation methods

CBD, common bile duct; ELAN, Efficient Layer Aggregation Network; CBS, Convolution + Batch Normalization + SiLU; CBM, Convolution + Batch Normalization + Sigmoid; MPConv, MaxPooling + Convolution; Concat, Concatenation; SPPCSPC, Spatial Pyramid Pooling Cross Stage Partial Connection; Rep, Re-parameterization.

Evaluation of model performance

Two validation methods, single-frame validation and short video clip validation, were employed to evaluate the YOLOv7 model’s performance on still images and videos respectively.

Single-frame validation

To evaluate the performance of the YOLOv7 model, predictions for the presence and localization of anatomical structures in test images were generated using bounding boxes and compared to the ‘ground truth’ annotations provided by hepato-pancreato-biliary (HPB) experts. The model's performance was assessed using precision, sensitivity, AP, mean average precision (mAP), and the F1-score, all of which were automatically computed by the model.

Precision, also referred to as positive predictive value (PPV) in medical statistics, measures the percentage of correctly identified anatomical structures (true positives) out of all the structures the model predicts as positive. Sensitivity, also known as recall, assesses how well the model detects all actual anatomical structures present in the image, capturing the proportion of true positives among all relevant instances. The F1-score provides a balanced measure by combining both precision and sensitivity, offering insight into the model’s ability to minimize false positives and false negatives.
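The minimal functions below make these frame-level definitions concrete; they are a generic illustration rather than the evaluation code used in the study (a detection counts as a true positive when it matches a ground-truth box, as shown in the IoU sketch that follows).

```python
# Generic frame-level metric definitions; not the authors' evaluation code.
def precision(tp: int, fp: int) -> float:
    """Positive predictive value: correct detections / all predicted detections."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp: int, fn: int) -> float:
    """Recall: correct detections / all ground-truth structures present."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity."""
    p, r = precision(tp, fp), sensitivity(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Example: a precision of 0.903 and sensitivity of 0.872 for the CBD (Table 2)
# correspond to an F1-score of about 0.887.
```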

For each structure, AP summarizes the model’s accuracy across different confidence thresholds, representing the ability to correctly identify a specific structure (for example, the CBD). mAP is the mean of the AP across all structures, providing an overall evaluation of the model’s performance. The mAP is calculated with an overlap criterion (intersection-over-union (IoU) threshold of 0.5), meaning the model’s predicted bounding box must overlap the ground truth by at least 50% to count as a correct match (detailed description in the Supplementary Methods section).
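The IoU matching rule can be sketched as follows for axis-aligned boxes in (x1, y1, x2, y2) pixel coordinates; this is an illustration of the criterion, not the study's code.

```python
# Minimal sketch of the IoU >= 0.5 matching criterion used for AP/mAP.
def iou(box_a: tuple, box_b: tuple) -> float:
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def is_true_positive(pred_box: tuple, gt_box: tuple, threshold: float = 0.5) -> bool:
    """A prediction counts as correct when its IoU with the ground truth is at least 0.5."""
    return iou(pred_box, gt_box) >= threshold
```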

Short video clip validation

In practical surgical applications, the YOLOv7 model's ability to detect a duct consistently, in a proportion of video frames above a particular threshold, can provide valuable information for image recognition during surgery. Short video clip validation therefore evaluated the model's performance from a different, more clinically realistic perspective than single-frame validation. In this context, the 6-class model was applied for short video clip validation.

The evaluation methodology was as follows. Each patient's surgical video was edited into a 2–3 min key video and segmented into 5-second clips using computer vision techniques. The 6-class YOLOv7 model performed inference on these clips, with only the bounding boxes for the CBD and cystic duct retained for analysis. The number of positive detections of each extrahepatic bile duct was counted. If detections exceeded a threshold of 10 per clip, the visibility of that extrahepatic bile duct was marked as ‘positive’; otherwise it was marked as ‘negative’. Two HPB surgeons validated these findings, and the model’s performance was assessed using accuracy, sensitivity, specificity, and F1-score.
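The clip-level decision rule described above can be summarized in a few lines. The detection function here is a placeholder standing in for inference with the trained 6-class YOLOv7 model, and the class names are assumed labels.

```python
# Sketch of the clip-level rule: a 5-second clip is labelled 'positive' for a duct
# class when the number of detections of that class across its frames exceeds 10.
from typing import Callable, List


def clip_visibility(
    frames: List,                            # decoded frames of one 5-second clip
    detect: Callable[[object], List[str]],   # placeholder: returns class names detected in a frame
    target_class: str,                       # e.g. 'CBD' or 'cystic_duct' (assumed labels)
    threshold: int = 10,
) -> str:
    detections = 0
    for frame in frames:
        detections += detect(frame).count(target_class)
    return "positive" if detections > threshold else "negative"
```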

Results

This study analysed ICG fluorescence-guided LC videos from 113 patients. Indications for LC included gallbladder polyps (4.42%, n = 5), gallbladder stones with biliary colic (28.32%, n = 32), acute cholecystitis (17.7%, n = 20), chronic cholecystitis (29.2%, n = 33), choledocholithiasis (19.47%, n = 22), and biliary pancreatitis (0.88%, n = 1). Laparoscopic common bile duct exploration was required in four patients (3.54%) because of unsuccessful preoperative endoscopic stone extraction. Postoperative complications included wound infection (3.53%, n = 4), intra-abdominal abscess (2.65%, n = 3), and bile leakage (1.76%, n = 2), with no CBD injuries. Nassar grading classified 15.92% (n = 18), 55.7% (n = 63), 22.1% (n = 25), and 6.1% (n = 7) of patients as grades I, II, III, and IV respectively. Patient characteristics are shown in Table 1.

Table 1

Patients’ demographics and characteristics

                                         Total (n = 113 patients)
Age (years), mean (s.d.)                 57.02 (15.63)
Sex
 Male                                    45 (39.82)
 Female                                  68 (60.18)
Surgical indication
 Gallbladder polyps                      5 (4.42)
 Gallbladder stones with biliary colic   32 (28.32)
 Acute cholecystitis                     20 (17.7)
 Chronic cholecystitis                   33 (29.2)
 Choledocholithiasis                     22 (19.47)
 Biliary pancreatitis                    1 (0.88)
Combined other surgical procedures
 CBD exploration                         4 (3.54)
 Gastric neoplasm excision               1 (0.9)
Complications
 Wound infection                         4 (3.53)
 Postoperative IAI                       3 (2.65)
 Postoperative bile leakage              2 (1.76)
Nassar scale
 I                                       18 (15.92)
 II                                      63 (55.7)
 III                                     25 (22.1)
 IV                                      7 (6.1)

Values are n (%) unless otherwise indicated. Nassar scale: grading of laparoscopic cholecystectomy difficulty based on intraoperative findings14. CBD, common bile duct; IAI, intra-abdominal infection.


The performance of the three evaluated models with the single-frame validation method is presented as an AP bar chart (Fig. 2) and in Table 2. The first model was trained on all six classes: liver, gallbladder, CBD, surgical instrument, cystic artery, and cystic duct. The second model excluded the cystic duct and cystic artery to address the imbalance in size and frequency of these structures. The third model focused solely on the extrahepatic bile ducts, using only the CBD and cystic duct for training.

Fig. 2 Average precision computed for each anatomical structure in the single-frame validation

mAP, mean average precision; CBD, common bile duct; GB, gallbladder.

Table 2

Evaluation matrix of three different YOLOv7 models based on the single-frame validation method

Target structure              Precision   Sensitivity   F1-score
Model development: 6-class
 All-class                    0.909       0.856         0.882
 CBD                          0.903       0.872         0.887
 Gallbladder                  0.954       0.940         0.947
 Cystic duct                  0.886       0.722         0.796
 Cystic artery                0.851       0.711         0.775
 Instrument                   0.947       0.955         0.951
 Liver                        0.916       0.938         0.927
Model development: 4-class
 All-class                    0.892       0.864         0.878
 CBD                          0.836       0.718         0.773
 Gallbladder                  0.927       0.889         0.908
 Instrument                   0.928       0.945         0.936
 Liver                        0.878       0.903         0.890
Model development: 2-class
 All-class                    0.771       0.469         0.583
 CBD                          0.903       0.602         0.722
 Cystic duct                  0.639       0.336         0.440

CBD, common bile duct.


The first model achieved a mAP of 0.846. Within this model, the AP for the CBD reached 0.864 and the cystic duct had an acceptable AP of 0.698. The AP for the gallbladder was 0.936, for the cystic artery 0.704, for instruments 0.944, and for the liver 0.929. The second model resulted in a lower overall mAP (0.83) and a lower AP for the CBD (0.664). The AP values for the other surgical landmarks in the second model remained similar to those in the first model: gallbladder, 0.97; instruments, 0.92; and liver, 0.865. The third model showed a marked decline in AP for the CBD (0.59) and the cystic duct (0.235) (the comparison is illustrated in Fig. 3).

Fig. 3 Precision-recall curves of the three trained models

a Six-class model trained on the common bile duct (CBD), liver, gallbladder (GB), cystic duct, cystic artery, and instruments. b Four-class model trained on the CBD, liver, GB, and instruments. c Two-class model trained on the CBD and cystic duct only. mAP, mean average precision.

The first model demonstrated superior performance in recognising the CBD in patients with symptomatic gallbladder stones and gallbladder polyps but exhibited lower accuracy in detecting the cystic duct in the cholecystitis group (Table S1, Fig. 4 and Fig. S1).

Fig. 4 Model predictions for the common bile duct (CBD) in cases of different Nassar intraoperative difficulty

Four examples of successful model predictions for images of different Nassar grades, compared with the original frames. Bounding boxes (bbx) are shown for the following structures: CBD, cystic duct, liver, gallbladder, and instruments.

Using the 6-class YOLOv7 model, the mean accuracy for the CBD and cystic duct was 94.39% and 84.97%, the sensitivity was 99.36% and 87.98%, the specificity was 19.35% and 88.89%, and the F1-score was 97.08% and 85.92% respectively (Table 3). The model demonstrated commendable proficiency in detecting the CBD across Nassar grades, maintaining high accuracy and sensitivity even as Nassar grade complexity increased, but for the cystic duct in Nassar grade IV patients the accuracy and sensitivity decreased to 55.88% and 60.61% respectively.

Table 3

Evaluation matrix of the 6-class YOLOv7 model for the extrahepatic bile ducts based on the video clip validation method

                     Accuracy   Precision   Sensitivity   Specificity   F1-score
All (n = 499)
 CBD                 94.39      94.90       99.36         19.35         97.08
 CD                  84.97      89.82       87.98         78.48         88.89
Nassar I (n = 103)
 CBD                 100.00     100.00      100.00        NA            100.00
 CD                  93.20      95.52       94.12         91.43         94.81
Nassar II (n = 164)
 CBD                 95.42      96.60       98.61         44.44         97.59
 CD                  88.24      97.22       87.50         90.91         92.11
Nassar III (n = 165)
 CBD                 94.86      94.80       100.00        18.18         97.33
 CD                  88.57      90.98       92.50         80.00         91.74
Nassar IV (n = 67)
 CBD                 82.35      83.58       98.25         NA            90.32
 CD                  55.88      54.05       60.61         51.43         57.14

Values are %. For the extrahepatic bile duct structures, performance is presented in terms of five common metrics: accuracy, precision, sensitivity, specificity, and F1-score (harmonic mean of precision and sensitivity). CBD, common bile duct; CD, cystic duct; n, number of video clips; NA, not applicable.


Discussion

This study validated an object detection model to detect critical surgical landmarks during fluorescence-guided LC. By integrating fluorescence imaging and deep learning, the model achieved an AP of 0.864 for CBD detection, demonstrating feasible performance across patients with varying degrees of surgical complexity, including grade IV cases.

AI-deep learning technology has been assessed for the objective evaluation of intraoperative images during LC. Madani et al. and Laplante et al. developed segmentation models to differentiate safe (Go) and dangerous (No-Go) dissection zones8,9. Mascagni et al. demonstrated the potential of deep neural networks for automated segmentation of hepatocystic anatomy in laparoscopic images, with early-stage clinical evaluation16. Tokuyasu et al. used the YOLOv3 algorithm to analyse landmark indication in 22 of 23 LC videos, achieving relatively low AP scores13. Notably, their study excluded videos with significant fibrosis, scarring, bleeding, or less-visible landmarks.

Compared with other object detection models used in conventional LC, the AI model in this study introduced key advancements. The integration of ICG fluorescence enhanced intraoperative visualization and functioned as a form of data preprocessing, improving model training by facilitating recognition of fluorescent structures such as the CBD and cystic duct. YOLOv7 offers significant improvements over previous YOLO models, with re-parameterized modules and dynamic label assignment strategies reducing parameters by 40% and computational demands by 50%, while delivering faster and more accurate detection15. The study hypothesized that including fluorescent anatomical structures and instruments in training enhanced recognition across different surgical scenes.

Tashiro et al.17 used an AI model with ICG fluorescence to identify loose connective tissue during LC in real time, achieving a Dice coefficient of 0.60. Other models have combined AI and ICG fluorescence, mainly in colorectal cancer detection, classifying cancer progression with high accuracy and minimal background noise owing to delayed ICG retention18. Another model applied AI and ICG fluorescence angiography in flap reconstruction, accurately identifying excisable tissue based on fluorescence intensity19. Compared with traditional methods, AI with ICG fluorescence requires fewer annotated data sets while maintaining or surpassing accuracy, enhancing visualization, and reducing manual annotation effort. With the increasing use of ICG-guided surgery, this combination offers a cost-effective approach to improving surgical precision and real-time decision-making.

Before AI advancements, procedures like CVS and fluorescence-guided surgery were used to reduce BDI. Recent data from Japan’s National Clinical Database indicate that BDI rates have remained steady over the last decade20. Both methods have limitations. A photo evaluation study revealed that only 26.8% of patients without cholecystitis had satisfactory CVS. In contrast, only 15.7% of patients with acute cholecystitis achieved a satisfactory CVS21. Stefanidis et al. also reported inconsistent adherence to CVS criteria, with only one-quarter of surgeons properly following them22. Fluorescence-guided LC presents challenges as well, with variations in bile duct fluorescence intensity due to tissue depth, diminishing effects over time, and background fluorescence interference. In this study, the AI-deep learning system integrated anatomical and fluorescence features to identify the CBD in real-time, aiding in achieving CVS. By continuously tracking the CBD throughout the dissection and before cystic duct transection, the system reduced the risk of BDI during LC.

In the subgroup analysis, the model’s recognition of the CBD and cystic duct showed little variation across different clinical indications. Higher Nassar grading scales were associated with more difficult dissections and reduced fluorescence visibility, significantly altering the appearance of anatomical structures. Although poor cystic duct recognition in Nassar grade IV cases was found, CBD identification remained feasible.

This study had several limitations. Although it included patients with varying degrees of surgical difficulty for model training, the overall sample size was small, and the distribution of Nassar grading scale cases was uneven. This imbalance may lead the model to perform better in simpler cases, limiting its effectiveness in more complex scenarios. Given the low clinical incidence of BDI at approximately 1% in LC patients, a large-scale, multicentre trial would be required to assess whether the AI tool can directly reduce BDI rates. The data set originated from a single centre with a consistent near-infrared camera system and ICG injection protocol, suggesting that further cross-validation and transfer learning may be needed to adapt the model to other systems. Variability in fluorescence visibility, influenced by factors such as visceral fat, inflammation, background fluorescence, and the duration of the ICG effect, presents a challenge for model training. Insufficient fluorescence intensity can lead to confusion between the cystic duct, cystic artery, and CBD. The manual selection of frames is time-consuming and may introduce human bias, potentially leading to false positives when visibility is poor during adhesiolysis or gallbladder take-down, underscoring the need for a comprehensive training data set.

In conclusion, this study demonstrated the feasibility of integrating an AI-deep learning system with fluorescence-guided surgery to identify extrahepatic bile ducts and anatomical structures during laparoscopic cholecystectomy. Future research should focus on improving cystic duct detection, expanding the data set for broader validation, and further exploring AI’s role in enhancing patient safety and surgical outcomes.

Funding

This research was funded by Chang Gung Memorial Hospital (grant number CMRPG8M1501 to Shih-Min Yin).

Acknowledgements

The authors thank the colleagues of the general surgery department for the surgical video recording. J.-J.J.L. and I.-M.C. contributed equally as corresponding authors.

Disclosure

The authors declare no conflict of interest.

Supplementary material

Supplementary material is available at BJS Open online.

Data availability

The data supporting this study’s findings are available from the corresponding author upon reasonable request.

Author contributions

Shih Min Yin (Conceptualization, Data curation, Funding acquisition, Project administration, Resources, Visualization, Writing—original draft, Writing—review & editing), I-Min Chiu (Formal analysis, Investigation, Methodology, Software, Supervision, Writing—review & editing), and Jenn-Jier Lien (Methodology, Resources, Supervision, Writing—review & editing).

References

1. Mangieri CW, Hendren BP, Strode MA, Bandera BC, Faler BJ. Bile duct injuries (BDI) in the advanced laparoscopic cholecystectomy era. Surg Endosc 2019;33:724–730
2. de'Angelis N, Catena F, Memeo R, Coccolini F, Martínez-Pérez A, Romeo OM et al. 2020 WSES guidelines for the detection and management of bile duct injury during cholecystectomy. World J Emerg Surg 2021;16:30
3. Way LW, Stewart L, Gantert W, Liu K, Lee CM, Whang K et al. Causes and prevention of laparoscopic bile duct injuries: analysis of 252 cases from a human factors and cognitive psychology perspective. Ann Surg 2003;237:460–469
4. Pesce A, Portale TR, Minutolo V, Scilletta R, Li Destri G, Puleo S. Bile duct injury during laparoscopic cholecystectomy without intraoperative cholangiography: a retrospective study on 1,100 selected patients. Dig Surg 2012;29:310–314
5. Sheffield KM, Han Y, Kuo YF, Townsend CM Jr, Goodwin JS, Riall TS. Variation in the use of intraoperative cholangiography during cholecystectomy. J Am Coll Surg 2012;214:668–679; discussion 679–81
6. Machi J, Tateishi T, Oishi AJ, Furumoto NL, Oishi RH, Uchida S et al. Laparoscopic ultrasonography versus operative cholangiography during laparoscopic cholecystectomy: review of the literature and a comparison with open intraoperative ultrasonography. J Am Coll Surg 1999;188:360–367
7. Pesce A, Latteri S, Barchitta M, Portale TR, Di Stefano B, Agodi A et al. Near-infrared fluorescent cholangiography—real-time visualization of the biliary tree during elective laparoscopic cholecystectomy. HPB (Oxford) 2018;20:538–545
8. Madani A, Namazi B, Altieri MS, Hashimoto DA, Rivera AM, Pucher PH et al. Artificial intelligence for intraoperative guidance: using semantic segmentation to identify surgical anatomy during laparoscopic cholecystectomy. Ann Surg 2022;276:363–369
9. Laplante S, Namazi B, Kiani P, Hashimoto DA, Alseidi A, Pasten M et al. Validation of an artificial intelligence platform for the guidance of safe laparoscopic cholecystectomy. Surg Endosc 2023;37:2260–2268
10. Mascagni P, Vardazaryan A, Alapatt D, Urade T, Emre T, Fiorillo C et al. Artificial intelligence for surgical safety: automatic assessment of the critical view of safety in laparoscopic cholecystectomy using deep learning. Ann Surg 2022;275:955–961
11. Mascagni P, Alapatt D, Laracca GG, Guerriero L, Spota A, Fiorillo C et al. Multicentric validation of EndoDigest: a computer vision platform for video documentation of the critical view of safety in laparoscopic cholecystectomy. Surg Endosc 2022;36:8379–8386
12. Kawamura M, Endo Y, Fujinaga A, Orimoto H, Amano S, Kawasaki T et al. Development of an artificial intelligence system for real-time intraoperative assessment of the critical view of safety in laparoscopic cholecystectomy. Surg Endosc 2023;37:8755–8763
13. Tokuyasu T, Iwashita Y, Matsunobu Y, Kamiyama T, Ishikake M, Sakaguchi S et al. Development of an artificial intelligence system using deep learning to indicate anatomical landmarks during laparoscopic cholecystectomy. Surg Endosc 2021;35:1651–1658
14. Nassar A, Ashkar K, Mohamed A, Hafiz A. Is laparoscopic cholecystectomy possible without video technology? Minimally Invasive Therapy 1995;4:63–65
15. Wang C-Y, Bochkovskiy A, Liao H-YM. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC. IEEE, 2023, 7464–7475
16. Mascagni P, Alapatt D, Lapergola A, Vardazaryan A, Mazellier J-P, Dallemagne B et al. Early-stage clinical evaluation of real-time artificial intelligence assistance for laparoscopic cholecystectomy. Br J Surg 2024;111:znad353
17. Tashiro Y, Aoki T, Kobayashi N, Tomioka K, Saito K, Matsuda K et al. Novel navigation for laparoscopic cholecystectomy fusing artificial intelligence and indocyanine green fluorescent imaging. J Hepatobiliary Pancreat Sci 2024;31:305–307
18. Kim J, Kim H, Yoon YS, Kim CW, Hong S-M, Kim S et al. Investigation of artificial intelligence integrated fluorescence endoscopy image analysis with indocyanine green for interpretation of precancerous lesions in colon cancer. PLoS One 2023;18:e0286189
19. Singaravelu A, Dalli J, Potter S, Cahill RA. Artificial intelligence for optimum tissue excision with indocyanine green fluorescence angiography for flap reconstructions: proof of concept. JPRAS Open 2024;41:389–393
20. Shiroshita H, Inomata M, Akira S, Kanayama H, Yamaguchi S, Eguchi S et al. Current status of endoscopic surgery in Japan: the 15th national survey of endoscopic surgery by the Japan Society for Endoscopic Surgery. Asian J Endosc Surg 2022;15:415–426
21. Terho P, Sallinen V, Lampela H, Harju J, Koskenvuo L, Mentula P. The critical view of safety and bile duct injuries in laparoscopic cholecystectomy: a photo evaluation study on 1532 patients. HPB (Oxford) 2021;23:1824–1829
22. Stefanidis D, Chintalapudi N, Anderson-Montoya B, Oommen B, Tobben D, Pimentel M. How often do surgeons obtain the critical view of safety during laparoscopic cholecystectomy? Surg Endosc 2017;31:142–146
