Identifying novel candidate genes that encode enzymes and other proteins involved in plant cell wall biosynthesis is essential for plant biologists studying plant cell walls. Many efforts have addressed this task by analyzing microarray data to cluster genes with similar expression patterns in one dimension, i.e. across all available experimental conditions. The underlying idea is that genes co-expressed with known cell wall-related genes may also be involved in cell wall biosynthesis. However, genes are not necessarily co-expressed in all microarray experiments.
We adopted a bi-clustering technique, which clusters co-expressed genes in two dimensions, i.e. genes as one dimension and experimental conditions as the other. Using the genes in Purdue's Cell Wall Gene Families database as seeds, we bi-clustered Arabidopsis microarray data covering 22,810 genes and 351 conditions into 141 co-expression networks. From these networks, 217 densely connected network modules were extracted by analyzing the networks' topological properties with the Molecular Complex Detection (MCODE) algorithm. Known seed genes and transcription factors (TFs) were then located and highlighted in the network module graphs, making it possible to build connections among seed genes, TFs and candidate genes. Manual investigation of selected modules not only confirmed many known links between seed genes and TFs, e.g. links between the TF MYB46 and a number of known enzymes involved in cellulose and lignin synthesis, but also revealed new connections that may suggest new links in the regulatory and metabolic pathways of plant cell wall biosynthesis. Our computational study identified over 2,000 candidate genes, together with the network modules in which they reside, that are accessible to interested experimentalists for further study. In addition, the upstream cis-regulatory motifs of genes belonging to the same co-expression module were predicted. The graphs of the co-expression networks and the modules within them, the corresponding genes and their specific expression (i.e. anatomy, development), and the conserved regulatory motifs were integrated and deposited into the Database of Cell Wall Related Proteins (CWRPdb).
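The motivation for clustering in two dimensions can be illustrated with a toy example. The sketch below is not the bi-clustering method used in this study; it only shows, on invented values, how a gene can track a seed gene closely within a subset of conditions while appearing uncorrelated when all conditions are pooled. The gene names (`SEED`, `GENE_A`, `GENE_B`), the expression values, the helper functions and the 0.9 threshold are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpressed_with_seed(expr, seed, conditions, threshold):
    """Return genes whose correlation with `seed`, computed only over the
    given condition indices, exceeds `threshold`."""
    conditions = list(conditions)
    seed_profile = [expr[seed][c] for c in conditions]
    hits = []
    for gene, values in expr.items():
        if gene == seed:
            continue
        profile = [values[c] for c in conditions]
        if pearson(seed_profile, profile) > threshold:
            hits.append(gene)
    return sorted(hits)

# Invented expression values over 8 conditions (hypothetical data).
# GENE_A follows SEED in conditions 0-3 but not in conditions 4-7.
expr = {
    "SEED":   [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 1.0, 4.0],
    "GENE_A": [1.1, 2.1, 2.9, 4.2, 4.0, 1.0, 3.0, 2.0],
    "GENE_B": [4.0, 1.0, 3.0, 2.0, 1.0, 4.0, 2.0, 3.0],
}

# One-dimensional view: correlate across all conditions.
all_conditions = coexpressed_with_seed(expr, "SEED", range(8), 0.9)

# Two-dimensional view: restrict to a subset of conditions.
subset_conditions = coexpressed_with_seed(expr, "SEED", [0, 1, 2, 3], 0.9)

print(all_conditions)
print(subset_conditions)
```

In this toy setting, `GENE_A` is missed when all eight conditions are used and recovered when only the first four are considered. A real bi-clustering algorithm searches for such gene and condition subsets jointly rather than taking the condition subset as given.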