PreGoLoF: Predict the Gain or Loss of Functions

Release 1.0


Preliminary results:
We have implemented this procedure and applied it to a number of mutated genes. The predicted functional changes due to APC and p53 mutations in 219 colorectal adenocarcinoma samples are outlined below.
Functional changes due to APC mutations in colorectal adenocarcinoma tissues. A bi-clustering analysis was conducted on the transcriptomic data of all 219 colon cancer samples in TCGA, 159 of which have APC mutations. Ten bi-clusters were detected, each representing a subset of samples with specific functional change(s) associated with APC mutations. One bi-cluster, enriched with concurrent mutations in exons 14 and 16 of the gene, shows considerable functional changes in innate immune responses, as shown in Figures 1, 2 and 3. Survival data revealed that patients in this bi-cluster have substantially lower survival rates than the other colon cancer patients with APC mutations. Given that APC plays a key role in innate immunity, this analysis suggests that the loss of immune-related functions through APC mutations may be a novel contributor to cancer development in the colon, in addition to the altered β-catenin and E-cadherin functions widely speculated in the cancer literature [1].
Functional changes due to p53 mutations in colon cancer. Of the same set of colon cancer samples, 121 have p53 mutations. A few dozen bi-clusters were found to be associated with p53 mutations. Specifically, cluster #1 is enriched for the following down-regulated pathways: mitochondrial lumen, membrane-enclosed lumen, organelle lumen, and nuclear lumen. This is consistent with published data showing that over-expressed p53 increases lumen sizes [2]. Cluster #2 is enriched for a large number of immune pathways, predominantly innate immunity functions, which is consistent with a recent publication in Science [3]. Cluster #3 is enriched for down-regulated apoptosis-related pathways, which corresponds to the widely studied loss of function caused by p53 mutations. Cluster #4 is enriched for pathways related to cytoskeletal reorganization, which is consistent with a recent study showing that p53 can regulate cytoskeletal reorganization [4].

Figure 1. The bar plots show the proportion of concurrent mutations and other mutations in the identified bi-cluster (left) versus the other samples (right). The survival curves on the right show that patients with the concurrent mutations have significantly (p < 0.005) lower survival rates.

Figure 2. Distribution of APC mutations in the samples in the bi-cluster versus the other samples. The x-axis and y-axis show the mutation positions and the samples, respectively; mutations in samples inside and outside the identified bi-cluster are colored red and green, respectively. A significant enrichment of the concurrent mutations on exons 14 and 15 and exon 16 can be found for the identified bi-cluster.

Figure 3. Expression profiles of the genes showing functional changes caused by the concurrent mutations.

Mutation patterns considered in the analysis:
Currently, we consider the following mutation patterns, and their combinations, that may lead to a heterogeneous functional gain or loss: 1) mutations that are significantly enriched on one or several exons, 2) mutations that are significantly enriched on one or several mutation sites, 3) mutations that are significantly enriched on one or several functional regions of the protein tertiary structure, 4) mutations that are significantly enriched in one or several mutation types, 5) concurrent mutations, and 6) the collective effect of two or more mutations.
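As an illustration of pattern 5 (concurrent mutations) combined with an enrichment test, the following is a minimal sketch, not the PreGoLoF implementation. It assumes a one-sided hypergeometric test as the enrichment statistic, which the text above does not specify. The sample names, exon assignments, bi-cluster membership, and the helper names hypergeom_upper_tail and has_concurrent are hypothetical and serve only to make the example runnable.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n samples are drawn without replacement
    from N samples, K of which carry the mutation pattern."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def has_concurrent(exons, required):
    """True if a sample has mutations on every exon in `required`."""
    return required <= set(exons)


# Hypothetical toy data: sample -> exons carrying a mutation.
sample_exons = {
    "s1": [14, 16],
    "s2": [14, 16],
    "s3": [14, 15, 16],
    "s4": [15],
    "s5": [16],
    "s6": [14],
    "s7": [15],
    "s8": [14, 16],
}
bicluster = {"s1", "s2", "s3"}   # hypothetical bi-cluster membership
required = {14, 16}              # concurrent pattern to test

# Samples carrying the concurrent pattern, and their overlap with the bi-cluster.
carriers = {s for s, ex in sample_exons.items() if has_concurrent(ex, required)}
k = len(carriers & bicluster)

# Probability of seeing at least k carriers in a random sample set
# of the same size as the bi-cluster.
p_value = hypergeom_upper_tail(k, len(carriers), len(bicluster), len(sample_exons))
print(f"carriers in bi-cluster: {k}/{len(bicluster)}, p = {p_value:.4f}")
```

The same test could be applied to patterns 1, 2 and 4 by replacing the carrier definition (for example, "has a mutation on a given exon" or "has a mutation of a given type"). With many bi-clusters and patterns, the resulting p-values would need a multiple-testing correction.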

Mutations and cancer types selected for study:
We have applied our method to the TCGA data of 20 cancer types to identify possible GoLoF and interactive effects of multiple mutations for 14 known cancer-associated genes and 4 genes frequently mutated among the examined samples. The selected cancer types and mutations are plotted in Figure 4. We will extend the analysis to infer GoLoF for another 20-30 cancer-associated or frequently observed mutations in the 20 cancer types.

Figure 4. Analyzed mutations and cancer types. The column color bar reflects the quality level (dark green for high quality and light green for low quality) of the mutation profile of each cancer type, as defined in the Methods section.