QServer: QUalitative BIClustering server


Release 2.0.0, on December 20, 2011

Supplementary materials for the paper:

QUBIC: a qualitative biclustering algorithm for analyses of gene expression data.
Guojun Li, Qin Ma, Haibao Tang, Andrew H Paterson, Ying Xu.
Nucleic Acids Research, 2009 37(15):e101.


Programs
  • QUBIC version 0.1 source code: qubic0.1.tar.gz; manual: qubic0.1.pdf.

Data sets
  • Synthetic data [Prelic et al (2006)]: http://www.tik.ee.ethz.ch/sop/bimax/
  • Synthetic data with scaling and overlap patterns, generated by ourselves: benchmark data and a large data set.
  • M3D database: http://m3d.bu.edu/cgi-bin/web/array/index.pl?section=home
  • Yeast data [Prelic et al (2006)]: http://www.tik.ee.ethz.ch/sop/bimax/ (The subset can be found here.)
  • Leukemia data [Armstrong et al (2002)] can be found here.
       We pre-processed the original data by replacing letters and negative PM-MM values with '0' and discretizing the significant values into characteristic symbols. Available files: the original data, the filtered version, and the pre-processing script.
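
The first pre-processing step described above (replacing letters and negative values with 0) might look roughly like the sketch below. This is not the distributed pre-processing script: the function names clean_value and clean_row are hypothetical, and the input is assumed to be a list of text tokens from one row of the expression file.

```python
def clean_value(token):
    """Return the token as a float, or 0.0 for non-numeric or negative entries."""
    try:
        value = float(token)
    except ValueError:
        return 0.0  # letters or other non-numeric text become 0
    # Negative values (e.g. negative PM-MM differences) also become 0.
    return value if value >= 0 else 0.0


def clean_row(tokens):
    """Apply clean_value to every token of one (hypothetical) data row."""
    return [clean_value(t) for t in tokens]


print(clean_row(["12.5", "-3.0", "A", "0", "7"]))
# prints [12.5, 0.0, 0.0, 0.0, 7.0]
```

The later discretization into characteristic symbols is not covered by this sketch.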

Results
    In addition, scripts to validate the enrichment of functional classifications for the E. coli and yeast data can be found here. The package contains the following utility scripts:
  • evaluate.py - for example, running $ python evaluate.py qubic.ecoli ecoli kegg produces qubic.ecoli.kegg, which contains the most significant KEGG assignment in E. coli for the biclusters found in qubic.ecoli.
  • stats.py - running $ python stats.py outputs the significance statistics for all result files in the current directory.
  • makefile - type $ make qubic to benchmark only QUBIC, or simply $ make to run all tests.
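
To illustrate what "most significant assignment" can mean for a bicluster, the sketch below scores each functional category with a hypergeometric tail probability and keeps the smallest p-value. This is a common enrichment test, but whether evaluate.py uses exactly this statistic is an assumption; the function names, gene identifiers, and category names are all hypothetical.

```python
from math import comb


def hypergeom_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn from `population` genes,
    of which `annotated` belong to the category."""
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


def best_category(bicluster_genes, categories, population):
    """Return (category_name, p_value) for the most enriched category."""
    genes = set(bicluster_genes)
    scored = []
    for name, members in categories.items():
        hits = len(genes & members)
        p = hypergeom_tail(population, len(members), len(genes), hits)
        scored.append((p, name))
    p, name = min(scored)
    return name, p


# Toy example: 10 genes in total, two hypothetical categories.
categories = {
    "pathA": {"g1", "g2", "g3"},
    "pathB": {"g4", "g5", "g6", "g7", "g8", "g9"},
}
name, p = best_category(["g1", "g2", "g3"], categories, population=10)
print(name, p)
```

In the toy example all three bicluster genes fall in pathA, so pathA is selected with p = 1/120.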



  • The prediction results are listed here:

              Prelic's benchmark   QUBIC benchmark   E. coli data   Yeast data   Leukemia data   Table 10
    QUBIC     YES                  YES               YES            YES          YES             YES
    Others    YES                  YES               YES            YES

    The comparison results on Prelic's benchmark, the QUBIC benchmark, and the E. coli and yeast data sets are summarized in the following three Excel sheets.
  • The benchmark files
  • Prelic's benchmark
  • QUBIC benchmark
  • E. coli and yeast
  • Running parameters of biclustering programs

    Program  On small data sets                                             On large data sets
    QUBIC    o=100 c=0.95 q=0.06 r=1 f=1 k=5% of columns (default setting)  discretize to three classes (up, down, no-change) after pre-processing
    BIMAX    (minimum gene and chip number)=2                               N/A
    ISA      (threshold genes and chips)=2, (seeds number)=100              N/A
    SAMBA    overlap factor 0.1, 100 probes to hash, kernel 4-4             N/A
    RMSBE    alpha=.4, beta=.5, gamma=1.2, random 10 genes and 10 chips     random 300 genes and 40 chips
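
The table mentions discretizing expression values into three classes (up, down, no-change). The sketch below shows one simplified, quantile-based way to do this for a single gene. It is not QUBIC's actual discretization procedure: the function name discretize_row is hypothetical, and treating q as the fraction of values in each tail is an assumption made for illustration.

```python
def discretize_row(values, q=0.06):
    """Label each value of one gene as -1 (down), 1 (up) or 0 (no-change).

    Assumption: the lowest and highest q fraction of the values
    (at least one value each) define the down and up cutoffs.
    """
    n = len(values)
    tail = max(1, int(n * q))
    ordered = sorted(values)
    low_cut = ordered[tail - 1]
    high_cut = ordered[n - tail]
    if low_cut >= high_cut:
        # Too little variation to separate the tails: everything is no-change.
        return [0] * n
    labels = []
    for v in values:
        if v <= low_cut:
            labels.append(-1)
        elif v >= high_cut:
            labels.append(1)
        else:
            labels.append(0)
    return labels


row = [5.0, 1.0, 9.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 4.7]
print(discretize_row(row, q=0.1))
# prints [0, -1, 1, 0, 0, 0, 0, 0, 0, 0]
```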

Contacts for technical details

    Qin Ma: maqin@csbl.bmb.uga.edu
    Haibao Tang: bao@uga.edu