Screen: 21
Birjoo Vaishnav, PhD, DABR
Mayo Clinic College of Medicine and Science Rochester
Baltimore, MD
Materials/Methods: Data from catheter insertion and planning in HDR prostate cases were collected. The prostate volume on ultrasound and CT and the number of catheters are available at insertion; after digitization and optimization, the D10% for the urethra and the V150 and V200 for the prostate are obtained for a V100 of approximately 95%. These data were used to train an AutoML model. AutoML is a subset of machine learning that automates model validation and evaluation to maximize prediction accuracy for one target variable based on many input feature variables. Using preset criteria, candidate models are trained with fivefold cross-validation, and a portion of the data is set aside as a holdout for later testing. The user interface of a free trial version of a commercially available platform was used. It was easy to deploy and gave access to the underlying competing algorithms, from which one is chosen as the best predictor based on statistical criteria. The AutoML model was trained and deployed on a set of 52 rows with four predictive features: TRUS volume of the prostate, number of slices, CT volume, and number of catheters. The prediction target was the urethral D10. A second set of 52 rows was used as a test set, with the urethral dose target column withheld.
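The data partitioning described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the commercial platform's actual procedure: the 20% holdout fraction, the random seed, and the function names are assumptions introduced here, since the abstract does not state them.

```python
import random


def split_holdout_and_folds(n_rows, holdout_fraction=0.2, n_folds=5, seed=0):
    """Partition row indices into a holdout set and n_folds cross-validation folds.

    holdout_fraction and seed are illustrative assumptions, not values
    reported in the abstract.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_rows * holdout_fraction)
    holdout = indices[:n_holdout]
    remaining = indices[n_holdout:]
    # Deal the remaining rows round-robin so fold sizes differ by at most one.
    folds = [remaining[i::n_folds] for i in range(n_folds)]
    return holdout, folds


def cross_validation_splits(folds):
    """Yield (train_indices, validation_indices), using each fold once for validation."""
    for k, validation in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


# 52 rows, matching the size of the training set in the abstract.
holdout, folds = split_holdout_and_folds(52)
for train, validation in cross_validation_splits(folds):
    print(len(train), len(validation))
```

Each candidate model would be fitted on the `train` rows and scored on the `validation` rows five times, while the `holdout` rows stay untouched until a final check.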
Results: The platform made it easy to create and deploy a model, but the dataset had only 52 rows, which may have limited the accuracy. The outputs for the test data were compared with the ground truth. Among the candidate algorithms, the elastic net had the least deviation from the ground truth, both in overall data spread and in deviation from the individual ground-truth values. The average deviation of the predicted urethral D10 from the ground truth was about 1.18, with a standard deviation of 1.34. The ground-truth values varied over a much wider range than the predictions, indicating a possible bias. The input urethra values had been rounded to the nearest integer, which may have contributed to the inaccuracy.
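The comparison metrics reported above can be computed along the following lines. This is a hedged sketch: the abstract does not say whether the deviation is signed or absolute, so absolute deviation is assumed here, and the sample numbers are made up for illustration and are not the study data.

```python
import statistics


def deviation_summary(predicted, ground_truth):
    """Summarize how far predictions fall from the ground truth.

    Assumes absolute deviation; the abstract does not specify the definition.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    deviations = [abs(p - t) for p, t in zip(predicted, ground_truth)]
    return {
        "mean_abs_deviation": statistics.mean(deviations),
        "sd_abs_deviation": statistics.stdev(deviations),
        # A predicted range much narrower than the ground-truth range
        # suggests the model is pulling predictions toward the mean.
        "ground_truth_range": max(ground_truth) - min(ground_truth),
        "predicted_range": max(predicted) - min(predicted),
    }


# Made-up illustrative values (integer ground truth mimics the rounding
# mentioned in the abstract); these are NOT the study data.
ground_truth = [110, 114, 118, 112, 120, 108]
predicted = [113.0, 114.0, 115.0, 113.5, 115.5, 112.5]

summary = deviation_summary(predicted, ground_truth)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Comparing the two ranges alongside the mean deviation makes the narrowing of the predictions visible even when the average error looks small.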
Conclusion: AutoML was trained on a small dataset of 52 rows with four features and tested on another set of 52 rows with the target values hidden. The mean predicted values of the D10U were close to the ground truth, and the algorithm selected by AutoML had the least deviation from it. A larger and more accurate dataset may be needed for training and to eliminate biases. With a larger dataset, a reliable, clinically applicable AI-based nomogram for HDR prostate brachytherapy may be constructed.