ML Predictive Genes
Mean Germline dN/dS
% Under Purifying Selection
Somatic Genes Tested
Median Somatic dN/dS
Final Candidates
— candidate genes identified

🧬 Germline Conservation

ℹ️
Germline dN/dS < 0.3 indicates strong purifying selection across vertebrate evolution, meaning the gene is functionally essential and intolerant of amino-acid changes.
📊
Interpretation: Most ML-predictive genes (≈80%) are under purifying selection (dN/dS < 0.3), consistent with essential gene function. However, predictive genes have a slightly higher mean dN/dS (0.234) than background genes (0.203), meaning they are marginally less conserved on average. The difference is statistically significant but biologically modest, and it does not undermine the finding that most predictive genes remain under strong purifying constraint.
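As a rough illustration of how the two germline summary figures (mean dN/dS and % under purifying selection) could be derived from per-gene values, here is a minimal sketch. The function name, the input list, and the example numbers are hypothetical and are not taken from the pipeline; only the 0.3 threshold comes from the text above.

```python
from statistics import mean

PURIFYING_THRESHOLD = 0.3  # germline dN/dS below this counts as purifying selection


def summarise_germline(dnds_values, threshold=PURIFYING_THRESHOLD):
    """Return the mean dN/dS and the percentage of genes below the threshold.

    Genes with no ortholog estimate (None) are skipped.
    """
    values = [v for v in dnds_values if v is not None]
    if not values:
        raise ValueError("no usable dN/dS values")
    n_purifying = sum(1 for v in values if v < threshold)
    return {
        "mean_dnds": mean(values),
        "pct_purifying": 100.0 * n_purifying / len(values),
    }


# Made-up values for illustration only (not real gene data).
example_values = [0.05, 0.10, 0.20, 0.25, 0.40, None]
summary = summarise_germline(example_values)
print(summary)
```

Running the same function on the predictive and background gene sets separately would give the two means being compared; the significance test behind that comparison is not specified here, so it is left out.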
Gene | Group | dN/dS | Selection Type | Mouse %ID

🔬 Somatic Selection

ℹ️
Somatic dN/dS ≥ 1.5 with FDR < 0.05 indicates positive selection for non-synonymous mutations within tumours, characteristic of cancer driver genes.
ℹ️ Note: Somatic dN/dS is estimated using a simplified binomial test. This approach overestimates the number of genes under positive selection compared to covariate-adjusted methods (dNdScv; Martincorena et al., 2017). The large number of nominally significant genes therefore most likely reflects background mutation rate heterogeneity rather than genuine positive selection. Only genes passing ALL three filters (ML-predictive + germline conserved + somatic dN/dS ≥ 1.5 with FDR < 0.05) are reported as candidates.
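The three-filter rule can be sketched as a single predicate over per-gene records. The record layout, field names, and placeholder genes below are assumptions made for illustration; only the thresholds (0.3, 1.5, 0.05) come from the text.

```python
GERMLINE_MAX = 0.3  # germline dN/dS below this counts as conserved
SOMATIC_MIN = 1.5   # somatic dN/dS at or above this suggests positive selection
FDR_MAX = 0.05      # FDR q-value cut-off


def is_candidate(gene):
    """True only if the gene passes all three filters."""
    return (
        gene["ml_predictive"]
        and gene["germline_dnds"] < GERMLINE_MAX
        and gene["somatic_dnds"] >= SOMATIC_MIN
        and gene["fdr_q"] < FDR_MAX
    )


# Placeholder records with hypothetical field names (not real results).
genes = [
    {"gene": "GENE_A", "ml_predictive": True, "germline_dnds": 0.12,
     "somatic_dnds": 2.4, "fdr_q": 0.01},
    {"gene": "GENE_B", "ml_predictive": True, "germline_dnds": 0.45,
     "somatic_dnds": 3.1, "fdr_q": 0.001},  # fails the germline filter
    {"gene": "GENE_C", "ml_predictive": True, "germline_dnds": 0.08,
     "somatic_dnds": 1.6, "fdr_q": 0.20},   # fails the FDR filter
    {"gene": "GENE_D", "ml_predictive": False, "germline_dnds": 0.10,
     "somatic_dnds": 2.0, "fdr_q": 0.01},   # not ML-predictive
]

candidates = [g["gene"] for g in genes if is_candidate(g)]
print(candidates)
```

Requiring all three conditions together is what keeps the inflated somatic significance calls from propagating into the final candidate list.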
Gene | n_nonsyn | n_syn | dN/dS | CI Low | CI High | FDR q

Germline vs. Somatic dN/dS

🌍 Multi-Species Conservation

Evolutionary Filtering Funnel

Progressive filtering through evolutionary constraints. See the full pipeline story for the complete analysis funnel.

⚠️
Methodological Notes
  • Germline dN/dS values are sourced from Ensembl Compara (pre-computed ortholog alignments), not calculated de novo. This provides robust estimates but relies on the Ensembl gene annotation version.
  • Somatic dN/dS is calculated using a simplified binomial exact test comparing observed nonsynonymous mutations to expectation under neutrality (expected nonsynonymous proportion ≈ 0.74). This is conceptually inspired by the dNdScv framework (Martincorena et al., Cell 2017) but does not use the full dNdScv negative-binomial model or gene-level covariates. A full dNdScv implementation is planned for a future update.
  • Genes with zero synonymous mutations (S=0) produce infinite dN/dS. These are flagged but retained if they are established cancer drivers.
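The simplified somatic test described in these notes can be sketched with the standard library alone. Several details are assumptions, since the notes do not specify them: the test is taken to be one-sided (upper tail), the dN/dS ratio is normalised by the neutral ratio 0.74 / 0.26, and FDR is computed with the Benjamini-Hochberg procedure. All function names are hypothetical.

```python
import math

P_NONSYN = 0.74  # expected nonsynonymous proportion under neutrality (from the notes)


def somatic_dnds(n_nonsyn, n_syn, p_nonsyn=P_NONSYN):
    """Observed N:S ratio relative to the neutral ratio p / (1 - p).

    Assumption: this normalisation is illustrative. Returns math.inf when
    n_syn == 0 and n_nonsyn > 0, and NaN when both counts are zero.
    """
    if n_syn == 0:
        return math.inf if n_nonsyn > 0 else math.nan
    return (n_nonsyn / n_syn) / (p_nonsyn / (1.0 - p_nonsyn))


def binom_upper_tail(k, n, p=P_NONSYN):
    """Exact one-sided p-value P(X >= k) for X ~ Binomial(n, p)."""
    tail = sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )
    return min(1.0, tail)  # guard against floating-point overshoot


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def evaluate_gene(n_nonsyn, n_syn):
    """dN/dS, binomial p-value, and an S=0 flag for one gene."""
    return {
        "dnds": somatic_dnds(n_nonsyn, n_syn),
        "p_value": binom_upper_tail(n_nonsyn, n_nonsyn + n_syn),
        "infinite_dnds": n_syn == 0 and n_nonsyn > 0,  # flag, do not drop
    }


# Made-up counts for illustration only.
counts = [(30, 2), (74, 26), (5, 0)]
results = [evaluate_gene(n, s) for n, s in counts]
qvalues = benjamini_hochberg([r["p_value"] for r in results])
```

Because this test has no gene-level covariates, it treats every gene as sharing the same neutral expectation. This is the simplification the notes identify as the source of inflated significance relative to dNdScv.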