Computer vision technologies markedly enhance the automation capabilities of robotic-assisted minimally invasive surgery (RAMIS) through advanced tool tracking, detection, and localization. However, the limited availability of comprehensive surgical datasets for training remains a significant challenge in this field. This research introduces a novel method that employs 3D Gaussian Splatting to generate synthetic surgical datasets. We propose a method for extracting 3D Gaussian representations of surgical instruments and background operating environments, then transforming and combining them to generate high-fidelity synthetic surgical scenes. We developed a data recording system capable of acquiring images alongside tool and camera poses in a surgical scene. Using this pose data, we synthetically replicate the scene, enabling a direct comparison of synthetic image quality (29.592 PSNR). As further validation, we trained two YOLOv5 models on the synthetic and real data, respectively, and assessed their performance on an unseen real-world test dataset. The synthetically trained model outperformed the model trained on real data by 12% when both were evaluated on real-world data.
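The scene-composition step described above, placing an instrument's Gaussians into a background splat using a recorded pose, can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: the dict-of-arrays splat layout, the function names `transform_gaussians` and `compose_scene`, and the (x, y, z, w) quaternion order are hypothetical, and rotation of view-dependent spherical-harmonic colors is omitted for brevity.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def transform_gaussians(means, quats, pose):
    """Apply a 4x4 rigid pose to Gaussian centers and orientations.

    means: (N, 3) Gaussian centers.
    quats: (N, 4) unit quaternions in (x, y, z, w) order.
    pose:  (4, 4) homogeneous transform, e.g. a recorded tool pose.
    """
    rot, t = pose[:3, :3], pose[:3, 3]
    new_means = means @ rot.T + t                                    # rotate, then translate centers
    new_quats = (R.from_matrix(rot) * R.from_quat(quats)).as_quat()  # compose orientations
    return new_means, new_quats

def compose_scene(bg, tool, tool_pose):
    """Concatenate background Gaussians with an instrument placed at tool_pose.

    bg and tool are dicts of arrays: 'means' (N, 3), 'quats' (N, 4),
    'scales' (N, 3), 'opacity' (N,). Scales and opacity are pose-invariant.
    """
    t_means, t_quats = transform_gaussians(tool["means"], tool["quats"], tool_pose)
    return {
        "means":   np.vstack([bg["means"],  t_means]),
        "quats":   np.vstack([bg["quats"],  t_quats]),
        "scales":  np.vstack([bg["scales"], tool["scales"]]),
        "opacity": np.concatenate([bg["opacity"], tool["opacity"]]),
    }
```

In this sketch, rendering the composed dict from the recorded camera pose with any 3D Gaussian Splatting rasterizer would then yield the synthetic image to be compared against the real capture.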