QServer: QUalitative BIClustering server

Release 2.0.0, on December 20, 2011

DNA microarrays are a multiplex technology used in molecular biology and medicine. They provide a powerful means of probing the functional states of a cell population by allowing simultaneous observation of the mRNA expression patterns of all of its genes, collected over time and/or under different experimental conditions. Because DNA microarrays measure changes in expression levels, they can help identify genes associated with a particular cellular condition, or even with specific biochemical pathways.

A popular way to visualize microarray data for gene expression analyses is to represent the dataset as a matrix with rows representing the genes and columns representing the conditions (or the other way around) with each element of the matrix representing the relative mRNA abundance of a gene under a specific condition.
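The matrix representation described above can be illustrated with a minimal sketch. The gene names, condition names, and abundance values below are invented for illustration only, and the helper functions `abundance` and `transpose` are hypothetical; they are not part of QServer.

```python
# Hypothetical toy data: names and values are invented for illustration.
genes = ["geneA", "geneB", "geneC"]
conditions = ["cond1", "cond2", "cond3", "cond4"]

# Rows are genes, columns are conditions; each entry is the relative
# mRNA abundance of one gene under one condition.
expression = [
    [2.1, 0.3, 1.8, 0.2],  # geneA
    [1.9, 0.4, 2.0, 0.1],  # geneB
    [0.2, 1.5, 0.3, 1.7],  # geneC
]


def abundance(gene, condition):
    """Look up the relative abundance of one gene under one condition."""
    return expression[genes.index(gene)][conditions.index(condition)]


def transpose(matrix):
    """Return the 'other way around' layout: conditions as rows, genes as columns."""
    return [list(column) for column in zip(*matrix)]


by_condition = transpose(expression)
```

Here `abundance("geneB", "cond3")` reads a single element of the matrix, and `by_condition` holds the same data with one row per condition.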

Numerous computational tools have been developed to analyze complex microarray data. Among them, grouping genes by similar expression patterns (co-expressed genes) with traditional clustering strategies is one of the most popular approaches to microarray data analysis.

In the context of microarray data analysis, traditional clustering techniques attempt to partition a set of genes into "clusters" with similar expression patterns under specified conditions, or to identify such clusters in an otherwise unstructured microarray dataset. While useful, such algorithms are known to be inadequate for the general gene-expression analysis problem, which often requires identifying genes that are co-expressed under some (to-be-identified) conditions rather than under all given conditions.

Biclustering (co-clustering) is a data mining technique that clusters the rows and columns of a matrix simultaneously. Cheng and Church first introduced the concept of direct clustering, originally proposed by Hartigan, to the field of gene expression data analysis and referred to it as biclustering. It extends traditional clustering by finding subsets of conditions under which some (to-be-identified) subsets of genes have similar expression patterns. Each such submatrix is called a bicluster.
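To make the idea of a bicluster as a submatrix concrete, here is a minimal sketch on invented toy data. It uses one deliberately simple notion of "similar expression" (within every selected condition, the selected genes differ by at most a tolerance). This criterion, the tolerance value, and the helper names `submatrix` and `is_coherent` are assumptions made for illustration; the sketch only checks a given candidate and is not the QUBIC algorithm or the Cheng and Church method.

```python
# Hypothetical toy matrix: rows are genes, columns are conditions.
genes = ["geneA", "geneB", "geneC", "geneD"]
conditions = ["cond1", "cond2", "cond3", "cond4"]
expression = [
    [2.0, 0.3, 2.1, 0.9],  # geneA
    [2.1, 1.5, 2.0, 0.1],  # geneB
    [0.2, 1.4, 0.3, 1.7],  # geneC
    [1.9, 0.8, 2.2, 1.2],  # geneD
]


def submatrix(matrix, row_idx, col_idx):
    """Extract the submatrix defined by a subset of rows and a subset of columns."""
    return [[matrix[r][c] for c in col_idx] for r in row_idx]


def is_coherent(sub, tolerance):
    """True if, within every column, the values differ by at most `tolerance`.

    This is an assumed, simplified similarity criterion for illustration.
    """
    return all(max(col) - min(col) <= tolerance for col in zip(*sub))


gene_subset = [0, 1, 3]  # geneA, geneB, geneD
all_conditions = [0, 1, 2, 3]
condition_subset = [0, 2]  # cond1, cond3

# Over all conditions the three genes do not look co-expressed ...
coherent_everywhere = is_coherent(
    submatrix(expression, gene_subset, all_conditions), tolerance=0.3
)
# ... but restricted to cond1 and cond3 they form a candidate bicluster.
coherent_in_subset = is_coherent(
    submatrix(expression, gene_subset, condition_subset), tolerance=0.3
)
```

In this toy example `coherent_everywhere` is `False` while `coherent_in_subset` is `True`, which is the situation biclustering targets: genes that are similar only under a subset of conditions. A real biclustering algorithm must also search for the gene and condition subsets, which this sketch takes as given.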

Various algorithms have been developed for the biclustering problem, such as BIMAX, ISA, SAMBA, RMSBE, BOLBOA, NNN and BUBBLE.

We recently developed a highly effective biclustering algorithm, the QUalitative BIClustering algorithm (QUBIC) [1]. This server, QServer, implements the QUBIC algorithm and provides a number of functional characterizations of each bicluster of genes. We believe that QServer will be an easy-to-use, hypothesis-generating platform for genomics researchers.

[1] G Li, Q Ma, H Tang, AH Paterson and Y Xu, "QUBIC: a qualitative biclustering algorithm for analyses of gene expression data", Nucleic Acids Research 2009 37:e101.