Cynthia Wallace
BAE Space and Mission Systems, Boulder, CO, United States
Publications
Research Article
Evaluation of Advanced Artificial Intelligence in Minimally Invasive Surgery Training: A Preliminary Study of the Large Language Models DeepSeek-R1 and Claude 3.5 Sonnet
Author(s): Brandon L. Staple*, Elijah M. Staple, Cynthia Wallace and Bevan D. Staple
Background/Objective: Minimally invasive surgery (MIS) reduces tissue trauma, pain, and recovery times but demands advanced technical skill acquisition. Current surgical training remains time-intensive and mentor-dependent. While robotics, simulation, and AI promise transformative improvements for surgical education, early large language models (LLMs) like ChatGPT raised concerns due to factual inaccuracies ("hallucinations") and limited explainability. It remains unclear whether modern LLMs, such as Claude 3.5 Sonnet and the reasoning-focused DeepSeek-R1, adequately overcome these limitations while ensuring the interpretability and reliability essential for medical applications. Moreover, their alignment with MIS-specific knowledge is understudied. This work preliminarily evaluates both models’ accuracy, reasoning capabilities, and error pa..
