%0 Journal Article
%A Rahman, SA
%A Walker, RC
%A Lloyd, MA
%A Grace, BL
%A van Boxel, GI
%A Kingma, BF
%A Ruurda, JP
%A van Hillegersberg, R
%A Harris, S
%A Parsons, S
%A Mercer, S
%A Griffiths, EA
%A O'Neill, JR
%A Turkington, R
%A Fitzgerald, RC
%A Underwood, TJ
%A Consortium, OCCAMS
%D 2020
%T Machine learning to predict early recurrence after oesophageal cancer surgery
%U https://crick.figshare.com/articles/journal_contribution/Machine_learning_to_predict_early_recurrence_after_oesophageal_cancer_surgery/12571238
%2 https://crick.figshare.com/ndownloader/files/23454308
%K OCCAMS Consortium
%K Ciccarelli - sec
%K Surgery
%K 11 Medical and Health Sciences
%X BACKGROUND: Early cancer recurrence after oesophagectomy is a common problem, with an incidence of 20-30 per cent despite the widespread use of neoadjuvant treatment. Quantification of this risk is difficult and existing models perform poorly. This study aimed to develop a predictive model for early recurrence after surgery for oesophageal adenocarcinoma using a large multinational cohort and machine learning approaches. METHODS: Consecutive patients who underwent oesophagectomy for adenocarcinoma and had neoadjuvant treatment in one Dutch and six UK oesophagogastric units were analysed. Using clinical characteristics and postoperative histopathology, models were generated using elastic net regression (ELR) and the machine learning methods random forest (RF) and extreme gradient boosting (XGB). Finally, a combined (ensemble) model of these was generated. The relative importance of factors to outcome was calculated as a percentage contribution to the model. RESULTS: A total of 812 patients were included. The recurrence rate at less than 1 year was 29·1 per cent. All of the models demonstrated good discrimination. Internally validated areas under the receiver operating characteristic (ROC) curve (AUCs) were similar, with the ensemble model performing best (AUC 0·791 for ELR, 0·801 for RF, 0·804 for XGB, 0·805 for ensemble). Performance was similar when internal-external validation was used (validation across sites, AUC 0·804 for ensemble). In the final model, the most important variables were number of positive lymph nodes (25·7 per cent) and lymphovascular invasion (16·9 per cent). CONCLUSION: The model derived using machine learning approaches and an international data set provided excellent performance in quantifying the risk of early recurrence after surgery, and will be useful in prognostication for clinicians and patients.
%I The Francis Crick Institute