Diagnostics of ovarian cancer via metabolite analysis and machine learning.

Document Type


Publication Date


Publication Title

Integr Biol (Camb)


california; santa monica; pni


Ovarian cancer (OC) is the second most common cancer of the female reproductive system. Because early-stage OC is largely asymptomatic and prognosis worsens in later stages, screening methods for OC are much needed. To justify use on asymptomatic patients, screening and diagnostic procedures must also be convenient and non-invasive. Recent developments in machine learning, applied in the field of metabolomics, have made this possible. The objective of this research was to use existing metabolomics data on OC and various analytic methods to develop a machine-learning model for classifying potentially OC-related metabolite biomarkers. Pathway analysis and metabolite-set enrichment analysis were performed on the gathered metabolite sets. Quantitative molecular descriptors were then used with various machine-learning classifiers for OC diagnostics based on related metabolites. We found that the OC-associated metabolites used in the machine-learning models are involved in five metabolic pathways linked to OC: Nicotinate and Nicotinamide Metabolism; Glycolysis/Gluconeogenesis; Aminoacyl-tRNA Biosynthesis; Valine, Leucine and Isoleucine Biosynthesis; and Alanine, Aspartate and Glutamate Metabolism. Several classification models for identifying OC from related metabolites were built, and their accuracies were assessed with 10-fold cross-validation. The most accurate model achieved 85.29% accuracy. Elucidating the biological pathways specific to OC from metabolic data, and observing changes in these pathways in patients, could contribute to the development of screening techniques for OC. Our results demonstrate that machine-learning models for OC diagnostics can be developed from metabolomics data.
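
The abstract reports accuracies estimated with 10-fold cross-validation but does not name the classifiers or the descriptor set. The sketch below only illustrates how such an estimate is computed in general. The nearest-centroid classifier, the synthetic two-dimensional data, and all function names are assumptions made for illustration and are not taken from the study.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cross_validated_accuracy(X, y, k=10, seed=0):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        for i in held_out:
            correct += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return correct / len(X)

# Synthetic, well-separated two-class data standing in for descriptor vectors
# (hypothetical; real molecular descriptors would replace these rows).
rng = random.Random(42)
X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(30)] + \
    [[rng.gauss(6.0, 1.0), rng.gauss(6.0, 1.0)] for _ in range(30)]
y = [0] * 30 + [1] * 30

accuracy = cross_validated_accuracy(X, y, k=10)
print(f"10-fold CV accuracy: {accuracy:.2%}")
```

Each sample is held out exactly once, so the reported figure is the share of all samples predicted correctly by a model that never saw them during training.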

Clinical Institute

Women & Children

Neurosciences (Brain & Spine)




Obstetrics & Gynecology