Validation of Virtual Reality Arthroscopy Simulator Relevance in Characterising Experienced Surgeons

Authors: Tronchot A, Berthelemy J, Thomazeau H, Huaulmé A, Walbron P, Sirveaux F, Jannin P.

Background

Virtual reality (VR) simulation is particularly well suited to learning arthroscopy skills. Despite significant research, one frequently reported drawback is the difficulty of distinguishing performance levels (Construct Validity) among experienced surgeons. It is therefore worth investigating new performance measures based on probe trajectories rather than commonly used metrics.

Hypothesis

It was hypothesized that greater experience in surgical shoulder arthroscopy would be correlated with better performance on a VR shoulder arthroscopy simulator and that experienced operators would share similar probe trajectories.

Materials & Methods

After the participants completed standardized questionnaires, 104 trajectories from 52 surgeons divided into 2 cohorts (26 intermediates and 26 experts) were recorded on a shoulder arthroscopy simulator. The procedure analysed was "loose body removal" in a right shoulder joint. Ten metrics were computed on the trajectories, including procedure duration, overall path length, economy of motion and smoothness. Additionally, Dynamic Time Warping (DTW) was computed on the trajectories and used for unsupervised hierarchical clustering of the surgeons.
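To illustrate how DTW compares two probe trajectories recorded at different speeds, the following is a minimal sketch of the classic DTW recurrence. It is not the study's implementation: the function names `dtw_distance` and `pairwise_dtw`, the Euclidean point distance, the representation of a trajectory as a list of (x, y, z) points in mm, and the toy trajectories are all assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW cost between two trajectories.

    Each trajectory is a list of (x, y, z) points (assumed format).
    The local cost is the Euclidean distance between aligned points.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def pairwise_dtw(trajectories):
    """Symmetric matrix of DTW costs between all pairs of trajectories."""
    k = len(trajectories)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return dist


# Toy data (hypothetical): the same path performed quickly and slowly,
# plus a path shifted by 1 mm along y.
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
slow = [(0, 0, 0), (0, 0, 0), (1, 0, 0), (1, 0, 0), (2, 0, 0)]
offset = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]

matrix = pairwise_dtw([straight, slow, offset])
```

In this toy case the fast and slow versions of the same path have a DTW cost of zero, because the warping absorbs the difference in timing, while the shifted path does not. A pairwise matrix of this kind is the sort of input a hierarchical clustering routine would take.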

Results

Experts were significantly faster (median 70.9 s, interquartile range [56.4-86.3] vs. 116.1 s [82.8-154.2], p<0.01), more fluid (4.6 × 10⁵ mm·s⁻³ [3.1 × 10⁵-7.2 × 10⁵] vs. 1.5 × 10⁶ mm·s⁻³ [2.6 × 10⁶-3.5 × 10⁶], p=0.05), and more economical in their motion (19.3 mm² [9.1-25.9] vs. 33.8 mm² [14.8-50.5], p<0.01), but there was no significant difference in path length (671.4 mm [503.8-846.1] vs. 694.6 mm [467.0-1090.1], p=0.62). The DTW clustering differentiated two expertise-related groups of trajectories with similar performance: the first group included 48 expert trajectories, and the second included 52 intermediate and 4 expert trajectories (sensitivity of 92%, specificity of 100%). Hierarchical clustering with DTW significantly distinguished expert operators from intermediate operators and found trajectory similarities among 24/26 experts.
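The reported sensitivity and specificity can be recovered from the cluster counts above. The short check below assumes that "expert" is the positive class and that assignment to the first group counts as a positive prediction; this reading is consistent with the reported figures but is not stated explicitly in the text. The function name is illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts taken from the clustering result (trajectories, not surgeons):
# 48 expert trajectories in the first group      -> true positives
# 4 expert trajectories in the second group      -> false negatives
# 52 intermediate trajectories in the second group -> true negatives
# 0 intermediate trajectories in the first group -> false positives
sens, spec = sensitivity_specificity(tp=48, fn=4, tn=52, fp=0)
summary = f"sensitivity {sens:.0%}, specificity {spec:.0%}"
print(summary)  # sensitivity 92%, specificity 100%
```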

Conclusions

This study demonstrated the Construct Validity of the VR shoulder arthroscopy simulator within groups of experienced surgeons. With new types of metrics simply based on the simulator's raw trajectories, it was possible to significantly distinguish levels of expertise. We demonstrated that clustering analysis with Dynamic Time Warping was able to reliably discriminate between expert operators and intermediate operators.

Clinical Relevance

The results have implications for the future of arthroscopic surgical training or post-graduate accreditation programs using virtual reality simulation.

Level of evidence: III; prospective comparative study.