piRNA in Machine-Learning-Based Diagnostics of Colorectal Cancer.
Publication Title
Molecules
Document Type
Article
Publication Date
9-11-2024
Keywords
california; santa monica; pni
Abstract
Objective biomarkers are crucial for early diagnosis, which promotes treatment and raises survival rates. We sought to determine whether piwi-RNAs (piRNAs), the smallest non-coding RNAs, and their transcripts could be used as biomarkers for colorectal cancer (CRC). Using previously published data from serum samples of patients with CRC, 13 differentially expressed piRNAs were selected as potential biomarkers. With these data, we developed a machine learning (ML) algorithm and created 1020 different piRNA sequence descriptors. With the Naïve Bayes Multinomial classifier, we were able to isolate the 27 most influential sequence descriptors and achieve an accuracy of 96.4%. To test the validity of our model, we used data from piRBase with known associations with CRC that we did not use to train the ML model. We achieved an accuracy of 85.7% with these new independent data. To further validate our model, we also tested data from unrelated diseases, including piRNAs with a correlation to breast cancer and no proven correlation to CRC. The model scored 44.4% on these piRNAs, showing that it can distinguish biomarkers of CRC from biomarkers of other diseases. The final results show that our model is an effective tool for diagnosing colorectal cancer. We believe that in the future, this model will prove useful for the diagnostics of colorectal cancer and other diseases.
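The workflow summarized in the abstract (sequence descriptors fed to a multinomial Naive Bayes classifier, then scored by accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the dinucleotide-count descriptor is a stand-in for the 1020 descriptors mentioned above, the selection of the 27 most influential descriptors is omitted, and the sequences and all function and class names are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides


def kmer_descriptors(seq, k=2):
    """Count overlapping k-mers in a sequence (one simple kind of sequence descriptor)."""
    vocab = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocab]


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            totals = [sum(col) for col in zip(*rows)]  # per-feature counts in class c
            denom = sum(totals) + n_features           # smoothing denominator
            self.log_likelihood[c] = [math.log((t + 1) / denom) for t in totals]
        return self

    def predict(self, X):
        def score(x, c):
            weights = self.log_likelihood[c]
            return self.log_prior[c] + sum(n * w for n, w in zip(x, weights))
        return [max(self.classes, key=lambda c: score(x, c)) for x in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy sequences, NOT real piRNAs: label 1 = "CRC-associated", 0 = "other".
train = [
    ("GGGCGGGCGGCGGGCGGGCGGCGGGC", 1),
    ("GCGGGCGGCGCGGGCGGGGCGGCGGG", 1),
    ("AUAUUAUAAUAUUAAUAUAUUAUAAU", 0),
    ("UAUAAUUAUAUAAUAUUAUAUAAUUA", 0),
]
X = [kmer_descriptors(seq) for seq, _ in train]
y = [label for _, label in train]

model = MultinomialNB().fit(X, y)
train_pred = model.predict(X)

# Sequences held out of training, mirroring the independent-validation idea.
held_out = model.predict([
    kmer_descriptors("CGGGCGGCGGGCGGGC"),
    kmer_descriptors("UAUAUAAUUAUAUAAU"),
])
```

The held-out call mirrors the validation described in the abstract, where sequences not used for training are classified and the fraction of correct calls is reported as accuracy. A real study would use far more sequences and a richer descriptor set than this toy example.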
Area of Special Interest
Digestive Health; Cancer
Specialty/Research Institute
Neurosciences; Oncology; Gastroenterology
DOI
10.3390/molecules29184311