A heuristic algorithm solving the mutual-exclusivity sorting problem
Abstract:
Binary (or boolean) matrices provide a common and effective data representation adopted in several domains of computational biology, especially for investigating cancer and other human diseases. For instance, they are used to summarise genetic aberrations (copy number alterations or mutations) observed in cancer patient cohorts, effectively highlighting combinatorial relations among them. One of these is the tendency for two or more genes not to be co-mutated in the same sample or patient, i.e. a mutual exclusivity trend. Exploiting this principle has allowed the identification of new cancer driver protein-interaction networks and has been proposed as a basis for rationally designing effective combinatorial anti-cancer therapies. Several tools exist to identify and statistically assess mutually exclusive cancer-driver genomic events. However, these tools still need to be equipped with robust and efficient methods for sorting the rows and columns of a binary matrix so that possible mutual exclusivity trends are visually highlighted.
Here, we formalise the mutual-exclusivity-sorting problem and present MutExMatSorting: an R package implementing a computationally efficient algorithm able to sort rows and columns of a binary matrix to highlight mutual exclusivity patterns. In particular, our algorithm minimises the extent of collective vertical overlap between consecutive non-zero entries across rows while maximising the number of adjacent non-zero entries in the same row. We demonstrate that existing tools for mutual exclusivity analysis are suboptimal according to this criterion and are outperformed by MutExMatSorting.
Supplementary data are available at Bioinformatics online.
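
To make the sorting criterion concrete, the following is a minimal Python sketch of one simple greedy heuristic for this kind of problem. It is an illustration under stated assumptions, not the algorithm implemented in MutExMatSorting, and the names `greedy_mutex_sort` and `apply_order` are hypothetical. It assumes rows are genes and columns are samples, and at each step it picks the row with the most non-zero entries in not-yet-placed columns, then places those columns contiguously.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def greedy_mutex_sort(matrix: Matrix) -> Tuple[List[int], List[int]]:
    """Return (row_order, col_order) for a binary matrix.

    Simplified greedy heuristic (an assumption for illustration only):
    repeatedly select the row covering the most still-unplaced columns,
    then append those columns to the column order as one contiguous block.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0

    remaining_rows = list(range(n_rows))
    unplaced_cols = set(range(n_cols))
    row_order: List[int] = []
    col_order: List[int] = []

    while remaining_rows:
        # Highest coverage of unplaced columns wins; ties go to the lowest row index.
        best = max(
            remaining_rows,
            key=lambda r: (sum(matrix[r][c] for c in unplaced_cols), -r),
        )
        row_order.append(best)
        remaining_rows.remove(best)

        # Place the columns newly covered by this row next to each other.
        newly_covered = sorted(c for c in unplaced_cols if matrix[best][c])
        col_order.extend(newly_covered)
        unplaced_cols.difference_update(newly_covered)

    # Columns with no non-zero entry in any row go last.
    col_order.extend(sorted(unplaced_cols))
    return row_order, col_order


def apply_order(matrix: Matrix, row_order: List[int], col_order: List[int]) -> Matrix:
    """Reorder rows and columns of the matrix according to the given orders."""
    return [[matrix[r][c] for c in col_order] for r in row_order]


# Toy example: three genes (rows) altered in six samples (columns).
toy = [
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
rows, cols = greedy_mutex_sort(toy)
sorted_toy = apply_order(toy, rows, cols)
for line in sorted_toy:
    print(line)
```

On this toy input the non-zero entries end up in adjacent blocks along a staircase, with no column shared between rows. Real cohorts contain co-mutated samples, so a practical method also has to decide how to handle overlapping entries, which this sketch does not attempt.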