Repository of mutational features

Linear clusters

Linear clusters for each gene and cohort were identified by OncodriveCLUSTL. We defined as significant those clusters in a driver gene with a p-value lower than 0.05. The start and end of each cluster were taken from the first and last mutated amino acids overlapping the cluster, respectively.

3D clusters

Information about the positions involved in the 3D clusters defined by HotMAPS was retrieved from the gene-specific output of each cohort. We defined as significant those amino acids in a driver gene with a q-value lower than 0.05.

Pfam Domains

Pfam domains for each driver gene and cohort were identified by smRegions. We defined as significant those domains in driver genes with a q-value lower than 0.1 and a positive log ratio of observed to simulated mutations (that is, observed mutations / simulated mutations > 1). The first and last amino acids of each domain are given by the start and end of the Pfam domain, respectively.
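The two significance criteria for domains can be sketched as a simple filter. This is an illustration only: the field names (`domain`, `q_value`, `observed`, `simulated`) and the example records are hypothetical and do not correspond to the actual smRegions output format. Skipping domains with no simulated mutations is also an assumption, since the text does not say how that case is handled.

```python
def significant_domains(domains, q_cutoff=0.1):
    """Keep domains with q-value < q_cutoff and observed/simulated > 1.

    `domains` is a list of dicts with hypothetical keys:
    "domain", "q_value", "observed", "simulated".
    """
    selected = []
    for d in domains:
        if d["simulated"] <= 0:
            # Ratio undefined; skipping such domains is an assumption.
            continue
        ratio = d["observed"] / d["simulated"]
        # ratio > 1 is equivalent to a positive log ratio.
        if d["q_value"] < q_cutoff and ratio > 1:
            selected.append(d)
    return selected


# Made-up records with placeholder domain identifiers.
example = [
    {"domain": "PF_A", "q_value": 0.01, "observed": 12, "simulated": 4.0},
    {"domain": "PF_B", "q_value": 0.01, "observed": 3, "simulated": 6.0},
    {"domain": "PF_C", "q_value": 0.30, "observed": 9, "simulated": 3.0},
]
kept = [d["domain"] for d in significant_domains(example)]
```

In this made-up example only `PF_A` passes both criteria: `PF_B` is depleted relative to the simulation and `PF_C` fails the q-value cutoff.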

Excess of mutations

The so-called excess of mutations for a given coding consequence type quantifies the proportion of observed mutations of that consequence type that are not explained by the neutral mutation rate. The excess is inferred from the dN/dS estimate \(\omega\) as \((\omega - 1)/\omega\). We computed the excess for missense, nonsense and splicing-affecting mutations.
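The formula follows from reading \(\omega\) as the ratio of observed to neutrally expected mutations: the number in excess is observed minus expected, and dividing by observed gives \((\omega - 1)/\omega\). A minimal sketch of the computation follows. The function name is hypothetical, and the function returns the raw value: for \(\omega < 1\) the result is negative, and the text does not state whether such values are truncated.

```python
def excess_of_mutations(omega: float) -> float:
    """Proportion of observed mutations in excess of the neutral
    expectation, computed as (omega - 1) / omega.

    No truncation is applied for omega < 1 (an assumption; the
    source text does not specify how that case is handled).
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    return (omega - 1.0) / omega


# With omega = 4, three out of every four observed mutations
# are in excess of the neutral expectation.
excess_example = excess_of_mutations(4.0)
```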

Mode of action

We computed the gene-specific dN/dS estimates for nonsense and missense mutations, denoted \(\omega_{\text{non}}\) and \(\omega_{\text{mis}}\). Each gene then defines a point in the plane with coordinates \((\omega_{\text{non}},\ \omega_{\text{mis}})\). We deemed a gene Act (resp. LoF) if its corresponding point sits above (resp. below) the diagonal \(x = y\), that is, if \(\omega_{\text{mis}} > \omega_{\text{non}}\) (resp. \(\omega_{\text{mis}} < \omega_{\text{non}}\)), up to an uncertainty threshold of 0.1. Genes within the uncertainty area, as well as genes with \(\omega_{\text{non}} < 1\) and \(\omega_{\text{mis}} < 1\), were deemed “ambiguous”.
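The classification rule can be sketched as follows. The function name is hypothetical. The sketch assumes that the uncertainty area is the band \(|\omega_{\text{mis}} - \omega_{\text{non}}| \leq 0.1\) around the diagonal; the text does not specify how the distance to the diagonal is measured, so other readings (for example, perpendicular distance) are possible.

```python
def classify_mode_of_action(omega_non: float, omega_mis: float,
                            threshold: float = 0.1) -> str:
    """Classify a gene as "Act", "LoF" or "ambiguous" from its
    nonsense and missense dN/dS estimates.

    Assumption: the uncertainty area is the band
    |omega_mis - omega_non| <= threshold around the diagonal.
    """
    # Both estimates below 1: no evidence of positive selection.
    if omega_non < 1 and omega_mis < 1:
        return "ambiguous"
    diff = omega_mis - omega_non
    if diff > threshold:
        return "Act"        # point above the diagonal
    if diff < -threshold:
        return "LoF"        # point below the diagonal
    return "ambiguous"      # inside the uncertainty band


# Illustrative values only, not taken from any real gene.
labels = [
    classify_mode_of_action(1.0, 3.0),   # missense-dominated
    classify_mode_of_action(5.0, 1.5),   # nonsense-dominated
    classify_mode_of_action(2.0, 2.05),  # too close to the diagonal
    classify_mode_of_action(0.5, 0.8),   # both estimates below 1
]
```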