A Novel Machine Learning Approach to PTEN Missense Variant Classification Using AlphaFold 3
Yash Jayesh Laddha*, Arwen Shah and Shubh Jayesh Laddha
Abstract
PTEN is among the most commonly mutated tumor suppressor genes across human cancers, yet hundreds of its missense variants remain classified as variants of uncertain significance (VUS), limiting clinicians' ability to assess cancer risk. Existing predictors rely mainly on sequence conservation and cannot evaluate the three-dimensional structural changes that influence PTEN function, leaving a significant gap in variant classification. This study aimed to determine whether structural changes caused by PTEN missense mutations could reliably distinguish cancer-associated variants from benign ones. All known PTEN missense variants were collected from UniProt, and structural models were generated using AlphaFold 3 for the wild type and 1,514 mutant sequences. After aligning each mutant to the wild-type structure, seventeen structural features were extracted using PyMOL, Biopython, and MDTraj, including secondary-structure shifts, hydrophobicity changes, and electrostatic differences. These features formed the dataset used to train Random Forest, XGBoost, logistic regression, and decision-tree classifiers. The models performed well: Random Forest and XGBoost achieved ROC-AUC scores of 0.985 and 0.983, respectively, with high recall and precision for the cancer-associated class, indicating strong sensitivity and reliability in identifying pathogenic variants. Cancer-associated variants displayed significantly higher local RMSD, hydrophobicity change, and electrostatic disruption, consistent with well-known PTEN destabilization mechanisms. The models also provided predictions for more than 1,300 VUS, offering a tool for prioritizing high-risk mutations. This work introduces one of the first comprehensive structural frameworks for PTEN variants and demonstrates that integrating AlphaFold-based modeling with machine learning can yield accurate, clinically relevant interpretations of PTEN mutations.
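To make the local RMSD feature concrete, the following is a minimal sketch of how such a value could be computed from two already-superimposed structures. It is illustrative only and not the study's implementation: the function name `local_rmsd`, the use of one C-alpha coordinate per residue, the window size, and the toy coordinates are all assumptions introduced here.

```python
import math


def local_rmsd(wt_coords, mut_coords, site, window=5):
    """RMSD over residues within `window` positions of `site` (0-based).

    Hypothetical helper: assumes one (x, y, z) C-alpha coordinate per
    residue and that the mutant is already aligned to the wild type.
    """
    if len(wt_coords) != len(mut_coords):
        raise ValueError("structures must have the same number of residues")
    if not 0 <= site < len(wt_coords):
        raise ValueError("site is outside the structure")

    # Clip the window at the chain termini.
    start = max(0, site - window)
    end = min(len(wt_coords), site + window + 1)

    # Sum squared displacements over the residues in the window.
    squared = 0.0
    for wt_atom, mut_atom in zip(wt_coords[start:end], mut_coords[start:end]):
        squared += sum((a - b) ** 2 for a, b in zip(wt_atom, mut_atom))

    return math.sqrt(squared / (end - start))


# Toy data (not real PTEN coordinates): a straight chain of 10 residues
# spaced 3.8 A apart, with residue 4 displaced by 2 A in the mutant.
wild_type = [(i * 3.8, 0.0, 0.0) for i in range(10)]
mutant = list(wild_type)
mutant[4] = (4 * 3.8, 2.0, 0.0)

rmsd_near_site = local_rmsd(wild_type, mutant, site=4, window=1)
rmsd_unchanged = local_rmsd(wild_type, wild_type, site=4, window=1)
```

Restricting the calculation to a window around the substituted residue keeps a small local distortion from being averaged away over the full chain, which is one plausible reason a local measure separates classes better than a global one.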
