April 30, 2025

AACR 2025

Detection of early stage colorectal cancer using cell-free oncRNA biomarkers and AI

Amir Momen-Roknabadi1, Mehran Karimzadeh1, Nae-Chyun Chen1, Taylor B. Cavazos1, Jieyang Wang1, Jeremy Ku1, Alex Degtiar1, Akshaya Krishnan1, Martha Hernandez1, Alice Huang1, Selina Chen1, Dang Nguyen1, Ti Lam1, Rose Hanna1, Lisa Fish1, Magdalena Gebala1, Alexx J. Smith1, Sukh Sekhon1, Jennifer Yen1, Jeff Gregg2, Hani Goodarzi3,4, Helen Li1, Fereydoun Hormozdiari5, Babak Behsaz1, Anna Hartwig1, Lee Schwartzberg1,6, Babak Alipanahi1

1Exai Bio Inc., Palo Alto, CA; 2University of Nevada School of Medicine, Reno, NV; 3University of California San Francisco, San Francisco, CA; 4Arc Institute, Palo Alto, CA; 5University of California, Davis, CA; 6University of Nevada, Reno, NV

Background

  • Epidemiology & Need:
    Colorectal cancer (CRC) is the second leading cause of cancer-related deaths globally [1].
    Early detection is critical to improve outcomes.
  • Current Challenges:
    Existing screening methods (colonoscopy, stool tests) have limitations in adherence.
    Commercial blood-based tests have limited early-detection sensitivity and may require large blood volumes.
  • Innovation Introduced:
    Development of a blood-based test that leverages a novel class of small orphan non-coding RNAs (oncRNAs) combined with generative AI modeling.

Methods

  • Biomarker Discovery:
    Utilized TCGA small RNA profiles to identify a library of pan-cancer oncRNAs enriched in tumors.
  • Sample Cohorts:
    Training Cohort: 613 samples (388 CRC, 225 controls) from multiple sources.
    Validation Cohort: 192 samples (113 CRC, 79 controls) from a distinct supplier.
  • Plasma Processing & Sequencing:
    1 mL plasma per sample processed using an automated cell-free RNA workflow.
    Sequencing depth: ~58 million 100-bp single-end reads.
  • AI Model Development:
    A generative ensemble AI model (Orion) [2] was trained using 5-fold cross-validation.
    Key performance metrics (e.g., AUROC, sensitivity at 90% specificity) computed.
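
The two metrics named in the last bullet can be sketched in plain Python. This is an illustrative sketch, not the authors' evaluation code: the function names, the tie handling, and the thresholding rule (a sample is called positive when its score exceeds the cutoff that keeps at least 90% of controls negative) are all assumptions.

```python
import math

def auroc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def sensitivity_at_specificity(case_scores, control_scores, specificity=0.90):
    """Return (sensitivity, threshold) with at least `specificity` of controls negative."""
    ranked = sorted(control_scores)
    # Number of controls that must score at or below the threshold (assumed rule).
    n_negative = math.ceil(specificity * len(ranked) - 1e-9)
    threshold = ranked[n_negative - 1]
    detected = sum(score > threshold for score in case_scores)
    return detected / len(case_scores), threshold

# Toy scores, invented for illustration only.
controls = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.90]
cases = [0.30, 0.50, 0.60, 0.70, 0.95]
print(auroc(cases, controls))
print(sensitivity_at_specificity(cases, controls))
```

Fixing specificity first and then reading off sensitivity mirrors how the results are reported here; the threshold would be chosen on the training data and kept fixed for the locked model.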

Figure 1. Model’s oncRNAs show differential abundance in CRC tissue vs. other cancer and control tissue samples.

The oncRNAs used in the model are more abundant in TCGA CRC samples than in control samples and other cancer types. The oncRNA burden is defined as the log1p-transformed CPM values. A Mann-Whitney U test confirms a significant difference (p < 0.01) in oncRNA abundance between CRC and other sample types, with lung cancer exhibiting the second-highest levels.
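
The burden definition and the group comparison in this caption can be illustrated with a small sketch. It rests on assumptions the poster does not spell out: CPM is summed across the model's oncRNAs before the log1p transform, the names `oncrna_burden` and `mann_whitney_u` are hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction, which is only reasonable for larger groups.

```python
from math import log1p, sqrt
from statistics import NormalDist

def oncrna_burden(oncrna_counts, total_reads):
    """log1p of summed counts-per-million over the model's oncRNAs (assumed aggregation)."""
    cpm = [count / total_reads * 1e6 for count in oncrna_counts.values()]
    return log1p(sum(cpm))

def mann_whitney_u(group_a, group_b):
    """U statistic for group_a and a two-sided p-value from the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    # U counts the pairs in which the group_a value is larger; ties count half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in group_a for b in group_b)
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value

# Invented toy values: one sample's oncRNA counts, then two groups of burden scores.
print(oncrna_burden({"oncRNA_a": 58, "oncRNA_b": 0}, total_reads=58_000_000))
print(mann_whitney_u([5.0, 6.0, 7.0, 8.0], [1.0, 2.0, 3.0, 4.0]))
```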

Figure 2. Overview of CRC detection study.

A total of 817 participants were included after applying inclusion and exclusion criteria that ruled out individuals with potential confounding factors such as prior cancer therapy, immune-modulating treatments, or recent surgery. Of these, 192 participants from a distinct supplier were reserved as the independent validation set, while the training set was used to train and lock the model. The locked model was subsequently applied to the independent validation set.

Overall Results

  • Model Performance in Discriminating Cancer from Control Samples:
    Training AUROC: 0.93 (95% CI: 0.92–0.95)
    Validation AUROC: 0.95 (95% CI: 0.93–0.98)
    Sensitivity at 90% Specificity:
    - Training sensitivity: 81.7% (95% CI: 77.5%–85.4%)
    - Validation sensitivity: 88.5% (95% CI: 81.1%–93.7%)
  • Early-Stage Detection:
    Stage I sensitivity: 72.1% (95% CI: 59.9%–82.3%) in training, 80.0% (95% CI: 51.9%–95.7%) in validation.
  • Additional Findings:
    Model scores increase with tumor burden (T-stage, N-stage).
    Consistent performance across the training and the independent validation set.
    Enhanced early detection capability.
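
The confidence intervals reported above can be approximated with an exact (Clopper-Pearson) binomial interval. The poster does not state which interval method was used, and the counts below are inferred from the percentages rather than given (for example, 80.0% read as 12 of 15 stage I validation samples), so treat this as a sketch under those assumptions. The function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _solve_increasing(f, target, iters=60):
    """Bisection for f(p) = target on [0, 1], where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_binomial_ci(k, n, confidence=0.95):
    """Clopper-Pearson interval for k successes out of n trials."""
    alpha = 1 - confidence
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

# Inferred counts, not stated in the poster: 12 of 15 (80.0%).
print(exact_binomial_ci(12, 15))
```

With 12 of 15, this interval comes out at roughly 0.519 to 0.957, in line with the stage I validation interval quoted above. The width of that interval reflects the small number of stage I samples in the validation set.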

Figure 3. Training set cross-validated performance.

a) Left panel: ROC curve showing cross-validated performance on the training set, averaged across folds. Middle panel: Stage-wise sensitivity at 90% specificity. Right panel: Sensitivity at 90% specificity across different T-stages. b) Model scores generally increase with the overall stage (left panel) and T-stage (middle panel) and N-stage (right panel) in the training set.

Figure 4. Validation set performance.

a) Left panel: ROC curve showing test performance on the validation set. Middle panel: Stage-wise sensitivity at 90% specificity. Right panel: Sensitivity at 90% specificity across different T-stages. b) Model scores increase with the overall stage (left panel) and T-stage (middle panel) and N-stage (right panel) in the validation set.

Conclusions

  • A novel class of small RNA biomarkers, oncRNAs, combined with generative AI detects early-stage CRC using only 1 mL of plasma.
  • At 90% specificity, sensitivity in the independent validation set was 88.5% overall and 80.0% for stage I CRC.
  • Blood-based oncRNA testing has the potential to improve CRC screening rates and adherence.

Disclosures:

AM, MK, NC, TBC, JW, JK, AD, AK, MH, AHu, SC, DN, TL, RH, LF, MG, AJS, SS, JY, HL, BB, AH, and LS are full-time employees of Exai Bio. BA is a co-founder, stockholder, and full-time employee of Exai Bio. HG is a co-founder, stockholder, and advisor of Exai Bio. JG and FH are advisors to Exai Bio Inc.

References:

  1. Sung H, et al. CA Cancer J Clin. 2021
  2. Karimzadeh M, et al. Nature Communications. 2024