CoCoNUT
Computational Comparative GeNomics Utilities Toolkit

2008

This is the website of CoCoNUT, a software tool for versatile comparative genomics tasks. The user manual and tutorial contain many examples of how to use CoCoNUT for efficient genome comparison. Here we outline the most important features of CoCoNUT.

Features of CoCoNUT

CoCoNUT is a software tool for performing the following comparative genomics tasks:
  1. Global alignment for two or multiple whole genomes.
  2. Finding regions of high similarity (candidate regions of conserved synteny) among two or multiple genomes.
  3. Comparison of a draft genome (a draft genome is not a single string but a set of strings called contigs) to a finished genome or to another draft genome (the current version is limited to at most 2 draft genomes).
  4. cDNA/EST mapping.
  5. Repeat analysis and detection of large segmental duplications.
CoCoNUT is based on the anchor-based strategy that is composed of three phases:
  1. Computation of fragments.
  2. Computation of highest-scoring chains of colinear fragments. The fragments of these chains compose the set of anchors.
  3. Alignment of the regions between the fragments of a chain by applying the same method recursively with less stringent parameters or by using standard dynamic programming.
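The second phase can be illustrated with a minimal sketch. It assumes two genomes, fragments given as coordinate pairs with a weight, and a chain score equal to the sum of the fragment weights (no gap costs). The names `Fragment`, `precedes`, and `global_chain` are hypothetical, and the quadratic-time dynamic programming below is only a simple stand-in, not the method CoCoNUT itself uses.

```python
from typing import List, NamedTuple, Tuple


class Fragment(NamedTuple):
    """A match between genome 1 [start1, end1] and genome 2 [start2, end2]."""
    start1: int
    end1: int
    start2: int
    end2: int
    weight: float


def precedes(f: Fragment, g: Fragment) -> bool:
    """True if f ends strictly before g begins in both genomes."""
    return f.end1 < g.start1 and f.end2 < g.start2


def global_chain(fragments: List[Fragment]) -> Tuple[float, List[Fragment]]:
    """Return (score, chain) for a highest-scoring chain of colinear,
    non-overlapping fragments (assumed score: sum of weights)."""
    # Sorting by start position guarantees that any fragment that can
    # precede frags[j] appears at an index smaller than j.
    frags = sorted(fragments, key=lambda f: (f.start1, f.start2))
    n = len(frags)
    if n == 0:
        return 0.0, []
    score = [f.weight for f in frags]   # best chain score ending at each fragment
    prev = [-1] * n                     # predecessor index for backtracking
    for j in range(n):
        for i in range(j):
            candidate = score[i] + frags[j].weight
            if precedes(frags[i], frags[j]) and candidate > score[j]:
                score[j] = candidate
                prev[j] = i
    best = max(range(n), key=score.__getitem__)
    chain = []
    k = best
    while k != -1:
        chain.append(frags[k])
        k = prev[k]
    chain.reverse()
    return score[best], chain


if __name__ == "__main__":
    example = [
        Fragment(1, 10, 1, 10, 10),
        Fragment(12, 20, 40, 48, 9),    # off-diagonal match
        Fragment(15, 30, 14, 29, 16),
        Fragment(35, 40, 33, 38, 6),
    ]
    best_score, best_chain = global_chain(example)
    print(best_score, [f.start1 for f in best_chain])
```

In this toy input the off-diagonal fragment is left out of the chain because it cannot be combined with the two fragments that follow it in genome 1.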
This strategy can be used for solving the aforementioned tasks. For example, if genomes of closely related organisms are compared, where there are no (or few) genome rearrangements, then in the second phase CoCoNUT can be used to compute an optimal global chain of colinear non-overlapping fragments. If genomes of distantly related organisms are compared, where rearrangement events are very likely to have taken place, then CoCoNUT can be used for computing a set of significant local chains. Each local chain represents a region of high similarity among the genomes being compared. Notably, CoCoNUT extends the program MGA by computing local alignments and by handling both forward and reverse strands. Moreover, CoCoNUT includes the following post-processing capabilities:
  1. Interactive visualization of comparison results using a Java-based program called VisCHAINER.
  2. Detection of syntenic regions and reporting these sets as permutations for studying genome rearrangements.
  3. Clustering cDNAs for detecting alternative splices and repeated genes.
  4. Assembling draft genomes, by comparison to finished genomes (under development).
For more details, see the user manual and tutorial.
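As an illustration of reporting syntenic regions as a permutation, the sketch below numbers the regions by their order in the first genome and then lists them in the order in which they occur in the second genome, negating a number when the region lies on the reverse strand. The `Region` type, the function name, and this particular output convention are assumptions made for the example; the format CoCoNUT actually writes is described in the user manual.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A syntenic region shared by two genomes (hypothetical representation)."""
    start1: int     # start position in genome 1
    start2: int     # start position in genome 2
    reverse: bool   # True if the region matches the reverse strand of genome 2


def signed_permutation(regions: List[Region]) -> List[int]:
    """Label regions 1..n by genome-1 order, then read the labels in genome-2
    order, using a negative sign for reverse-strand regions."""
    by_genome1 = sorted(range(len(regions)), key=lambda i: regions[i].start1)
    label = {idx: rank + 1 for rank, idx in enumerate(by_genome1)}
    by_genome2 = sorted(range(len(regions)), key=lambda i: regions[i].start2)
    return [-label[i] if regions[i].reverse else label[i] for i in by_genome2]


if __name__ == "__main__":
    regions = [
        Region(start1=100, start2=5000, reverse=False),
        Region(start1=2000, start2=300, reverse=True),
        Region(start1=4000, start2=2500, reverse=False),
    ]
    print(signed_permutation(regions))
```

A signed permutation of this kind is the usual input for genome rearrangement studies, for example when estimating the number of inversions separating two genomes.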

Efficient algorithms and data structures

CoCoNUT is based on algorithms and data structures optimized to handle large datasets:
  • It uses the Vmatch package, which is based on the enhanced suffix array, for generating the fragments.
  • It uses the program CHAINER, which is based on techniques from computational geometry, for computing chains specific to the comparative genomics task at hand.
  • Other programs are implemented to post-process the resulting chains. This post-processing depends on the task carried out, and it includes, among others, computing alignment on the nucleotide level, finding syntenic regions, and visualizing the results.
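To give an idea of the index behind fragment generation, here is a toy sketch of the two core tables of an enhanced suffix array: the suffix array itself and the table of longest-common-prefix (LCP) lengths between neighbouring suffixes. The construction below is deliberately naive and the function names are hypothetical; it does not reproduce Vmatch's implementation or interface.

```python
from typing import List


def suffix_array(text: str) -> List[int]:
    """Start positions of all suffixes of text in lexicographic order.
    Naive construction by sorting the suffixes, for illustration only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_table(text: str, sa: List[int]) -> List[int]:
    """lcp[k] is the length of the longest common prefix of the suffixes
    starting at sa[k-1] and sa[k]; lcp[0] is 0. Linear-time scan in the
    style of Kasai et al."""
    n = len(text)
    rank = [0] * n
    for k, start in enumerate(sa):
        rank[start] = k
    lcp = [0] * n
    h = 0  # number of characters already known to match
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix preceding suffix i in sorted order
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    print(sa)
    print(lcp_table(text, sa))
```

An LCP entry of length at least k indicates that a substring of length k occurs at both neighbouring suffix positions, which is the kind of information an exact-match fragment generator relies on.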
    Availability

    CoCoNUT is free for academic research, educational and demonstration purposes.
    1. Download CoCoNUT: Please send the CoCoNUT-license agreement to the author in order to obtain the download link.
      Note that you also need a license agreement for Vmatch (if you do not already have one) to obtain its binaries.

    2. Download the visualization tool VisCHAINER: VisCHAINER Webpage
    For a commercial license, please contact the author directly.

    CoCoNUT is available for the following platforms:

    The standard version of CoCoNUT is compiled in 32-bit mode. For large server-class machines (e.g., SUN-Sparc/Solaris), CoCoNUT can be compiled in 64-bit mode.

    If you need CoCoNUT for an additional platform, please contact the author.

    Formats and Usage

    Please see the user manual and tutorial for details.

    Test data

    Here, you can download the test data (size is about 36 Mb) needed to run the examples of the tutorial.

    Developer

    1. Mohamed Ibrahim Abouelhoda, previously at the Dept. of Bioinformatics, University of Ulm, Germany.

    Important Documents

    The CoCoNUT-manual

    The CoCoNUT-license agreement form

    External Packages

  • Fragment generation tool: We recommend using Vmatch and the Multimat/Ramaco program. However, CoCoNUT can use any kind of fragments as long as they are given in the correct input format; this requires you to edit some lines in the program.
  • Perl
  • Gnuplot: (optional) for producing postscript images of the comparison results.
  • Java: (optional) for running the interactive visualization tool VisCHAINER.
    Comments and Bugs

    Please send your comments and suggestions to the authors.

    Acknowledgment

    CoCoNUT is part of the DFG project "Entwicklung eines Software-Systems zum multiplen Genomvergleich" (Development of a Software System for Multiple Genome Comparison), supported by DFG grant Oh 54/4-1.

    My thanks to Enno Ohlebusch, Stefan Kurtz, Janina Reeder, and Kathrin Hockel for their help and useful suggestions.

    Bibliography

    1. Mohamed I. Abouelhoda, Stefan Kurtz, Enno Ohlebusch
      CoCoNUT: an efficient system for the comparison and analysis of genomes
      BMC Bioinformatics, 9:476, 2008.

    2. Mohamed Ibrahim Abouelhoda, Enno Ohlebusch
      CHAINER: Software for Comparing Genomes.
      In 12th International Conference on Intelligent Systems for Molecular Biology/3rd European Conference on Computational Biology.

    3. Mohamed Ibrahim Abouelhoda, Enno Ohlebusch
      Chaining Algorithms for Multiple Genome Comparison
      Journal of Discrete Algorithms, to appear.

    4. Mohamed Ibrahim Abouelhoda, Enno Ohlebusch
      A Local Chaining Algorithm and its Applications in Comparative Genomics
      Proceedings of the 3rd Workshop on Algorithms in Bioinformatics, pages 1-16, LNBI 2812, 2003. © Springer-Verlag

    5. Mohamed Ibrahim Abouelhoda, Stefan Kurtz, Enno Ohlebusch
      Replacing Suffix Trees with Enhanced Suffix Arrays
      Journal of Discrete Algorithms, 2(1):53-86, 2004.