Using MUMmer from SyMAP
This document describes what to do when one or more MUMmer alignments fail. It applies to SyMAP release v5.0.8 and later. The corresponding MUMmer documentation is for v3 and v4.

Executables

The alignment programs are provided in the symap/ext directory, with executables for 64-bit Linux and 64-bit MacOS. SyMAP selects the correct directory for the machine you are running on, so you do not need to do anything. When SyMAP creates a database, it (1) checks the MySQL variables and (2) checks that the external programs are executable. If you see a message like:
  ***Error - file is not executable: ext/mummer/mac/promer
Execute:
  > chmod 755 ext/mummer/mac/promer
Execute the program from the command line to make sure it works on your machine, e.g.
  >./ext/mummer/mac/promer

    USAGE: promer  [options]  <Reference>  <Query>

    Try './ext/mummer/mac/promer -h' for more information.
The above shows that the promer code will execute on my MacOS.
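If several programs are reported as not executable, the chmod step can be applied to a whole directory at once. The following is a minimal sketch, not part of SyMAP: the function name fix_exec_bits is hypothetical, and it assumes every regular file in the directory you pass (e.g. ext/mummer/mac) is meant to be executable.

```sh
# Hypothetical helper (not part of SyMAP): give every regular file in a
# directory mode 755 if it is not already executable, and report each fix.
# Assumes all regular files in the directory are meant to be programs.
fix_exec_bits() (
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and empty globs
        if [ ! -x "$f" ]; then
            chmod 755 "$f" && echo "fixed: $f"
        fi
    done
)
```

Run it from the symap directory as: fix_exec_bits ext/mummer/mac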

MUMmer output from SyMAP

The resulting alignment files are in:
   data/seq_results/<project-name1>-to-<project-name2>/align
For example,
   data/seq_results/demo_seq_to_demo_seq2/align> ls
   all.done				demo_seq_cc.demo_seq2_f2.mum
   demo_seq_cc.demo_seq2_f1.mum		demo_seq_cc.demo_seq2_f2.mum.done
   demo_seq_cc.demo_seq2_f1.mum.done
SyMAP removes all MUMmer files except the ".mum" files; if you prefer to keep them, start symap with the "-mum" command line parameter, i.e.
   ./symap -mum
and then run the alignments.

The log files are in the logs/<project-name1>-to-<project-name2> directory.

After the database is complete, these can be removed. However, sometimes SyMAP version updates require the project files to be reloaded and/or the synteny to be recomputed; if these files remain, the existing MUMmer files will be used, which saves a lot of time.
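Before removing or reusing an align directory, it can help to see which ".mum" files have a matching ".mum.done" file. The sketch below assumes, based only on the file names in the listing above, that a ".mum.done" file marks a finished alignment process; the function name pending_mum is hypothetical.

```sh
# Hypothetical helper: print each .mum file that has no matching .mum.done.
# ASSUMPTION: a .mum.done file marks a finished process (inferred from the
# file names in the example listing, not confirmed by SyMAP documentation).
pending_mum() (
    align_dir=$1
    for f in "$align_dir"/*.mum; do
        [ -e "$f" ] || continue          # the glob matched nothing
        [ -e "$f.done" ] || echo "$f"
    done
)
```

Run it as: pending_mum data/seq_results/demo_seq_to_demo_seq2/align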

Finding the problem

If the MUMmer alignment fails, inspect the log files. The log files are as follows, where
  p1 = project-name1 and p2 = project-name2 of the analyzed pair:
   symap_5/
     error.log   # a SyMAP error writes its trace data to this file, which also lists failed MUMmer alignments

     logs/
       <dbName>_LOAD.log     # keeps track of data loaded or removed from database.
       <p1>-to-<p2>/         # one directory per project-to-project alignment
          <p1_cc.p2_f1>.log  # MUMmer terminal output - one file per MUMmer process
          <p1_cc.p2_f2>.log  #    fn: n = 1, 2, ... is the process number, e.g. f2 is the 2nd process
          symap.log          # keeps most of the SyMAP output shown on the terminal for this A&S

     e.g. p1 = demo_seq, p2 = demo_seq2
       symapDemo_load.log
       demo_seq_to_demo_seq2/
         demo_seq_cc.demo_seq2_f1.log  # MUMmer terminal output
         demo_seq_cc.demo_seq2_f2.log  # MUMmer terminal output
         symap.log                     # SyMAP terminal output
→ If an alignment is listed as failed in the error.log file, the corresponding <p1_cc.p2_fn>.log file will contain the MUMmer error.

→ If the error is not found in the log files or it is not clear, try the following:
Copy the command from the terminal (or log file), and paste it on a new terminal line to execute, e.g.

  ext/mummer/mac/promer -p data/seq_results/demo_seq_to_demo_seq2/align/demo_seq_cc.demo_seq2_f2.promer \
    data/seq_results/demo_seq_to_demo_seq2/tmp/demo_seq2/demo_seq2_f2.fa \
    data/seq_results/demo_seq_to_demo_seq2/tmp/demo_seq/demo_seq_cc.fa
This shows promer output directly on the terminal.
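When there are many processes, searching the per-process logs can be quicker than opening each one. The following sketch prints every MUMmer log in a pair's log directory that contains the word "error" (in any case), with the matching lines. The function name scan_mummer_logs is hypothetical, and the search word is an assumption based on the messages shown in this document.

```sh
# Hypothetical helper: list MUMmer process logs that mention "error".
# symap.log is skipped because it holds SyMAP output, not MUMmer output.
scan_mummer_logs() (
    log_dir=$1
    for f in "$log_dir"/*.log; do
        [ -e "$f" ] || continue                          # no logs present
        [ "$(basename "$f")" = "symap.log" ] && continue
        if grep -qi 'error' "$f"; then
            echo "== $f"
            grep -ni 'error' "$f"                        # line number and text
        fi
    done
)
```

Run it from the symap directory as: scan_mummer_logs logs/demo_seq_to_demo_seq2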

Getting Help

If none of the suggestions in this document fixes your problem, email symap@agcol.arizona.edu with the following files (described above):
  1. error.log
  2. logs/<p1>-to-<p2>/symap.log
  3. logs/<p1>-to-<p2>/<p1_cc.p2_fn>.log, where n is the process number
  4. Any output to the terminal (either copy and paste into the email, or send a screen capture)
For example, email the terminal output and:
  symap/error.log
  symap/logs/demo_seq_to_demo_seq2/symap.log
  symap/logs/demo_seq_to_demo_seq2/demo_seq_cc.demo_seq2_f2.log
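The files listed above can be collected into one archive to attach to the email. This is a sketch only: the function name bundle_logs is hypothetical, and it includes every ".log" file for the pair (a superset of what is requested) rather than picking the failed process. Terminal output still has to be copied separately.

```sh
# Hypothetical helper: archive error.log plus all logs for one pair.
# Arguments: symap directory, pair directory name, output .tar.gz path.
bundle_logs() (
    symap_dir=$1; pair=$2; out=$3
    case $out in /*) ;; *) out=$PWD/$out ;; esac   # keep path valid after cd
    cd "$symap_dir" || exit 1
    set --                                         # collect files as arguments
    if [ -f error.log ]; then set -- "$@" error.log; fi
    for f in "logs/$pair"/*.log; do
        if [ -f "$f" ]; then set -- "$@" "$f"; fi
    done
    if [ $# -eq 0 ]; then echo "no log files found" >&2; exit 1; fi
    tar -czf "$out" "$@"
)
```

Run it as: bundle_logs symap demo_seq_to_demo_seq2 symap_logs.tar.gz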

Out of memory

A MUMmer failure is typically caused by insufficient memory. Either of the following errors usually indicates a memory problem:
  Alignment program error code: 141
  20220512|075853|6007| ERROR: mummer and/or mgaps returned non-zero, please file a bug report
or
#..........................ERROR: mummer and/or mgaps returned non-zero, please file a bug report
Alignment program error code: 1
The error code appears on the terminal but not in the symap/logs/<..>/symap.log file; it should also appear in the MUMmer log file (see the "Finding the problem" section for the file containing the MUMmer terminal output).

Limit CPUs and uncheck Concat

There is no straightforward way to know whether you have enough memory, as it depends on the size and complexity of the two genomes being compared. If you think memory may be tight, you can limit the amount used at one time by limiting the number of CPUs, as each CPU uses a considerable amount of memory (e.g. 4 CPUs could collectively use 20 GB of memory at once).

See Concat for a description and timing results.
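As a rough way to pick a CPU limit, divide the memory you can spare by the memory you expect each process to use. The sketch below does only that arithmetic. The function name max_cpus is hypothetical, and the per-process figure is a guess you must supply, since it depends on your genomes and SyMAP does not report it.

```sh
# Hypothetical helper: largest CPU count whose combined memory fits a budget.
# Arguments are whole numbers of GB. per_cpu_gb is YOUR estimate, not a
# value provided by SyMAP or MUMmer.
max_cpus() (
    avail_gb=$1; per_cpu_gb=$2
    n=$(( avail_gb / per_cpu_gb ))       # integer division rounds down
    if [ "$n" -lt 1 ]; then n=1; fi      # at least one process must run
    echo "$n"
)
```

For example, with 16 GB available and an estimate of 5 GB per process, max_cpus 16 5 prints 3.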

Not-masked or Soft-masked

A memory problem can occur if the genome sequence is not masked or only soft-masked. Either: (1) change the sequence to hard-masked, or (2) set the SyMAP parameter mask_all_but_genes to yes.
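For option (1), the following sketch checks for soft-masking and writes a hard-masked copy. It assumes the usual FASTA convention that soft-masked bases are lowercase and hard-masked bases are N. The function names count_soft_masked and hard_mask are hypothetical and are not part of SyMAP.

```sh
# Hypothetical helpers, assuming soft-masked bases are lowercase letters.

# Count lowercase characters in the sequence lines (header lines ignored).
count_soft_masked() (
    grep -v '^>' "$1" | tr -cd '[:lower:]' | wc -c | tr -d ' '
)

# Print a hard-masked copy: every lowercase base becomes N; header lines
# (starting with ">") are left unchanged.
hard_mask() (
    sed '/^>/!s/[[:lower:]]/N/g' "$1"
)
```

Run them as: count_soft_masked chr1.seq and hard_mask chr1.seq > chr1.masked.seq, where the file names are placeholders for your own sequence files. A count of 0 means the file has no soft-masked bases.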

Fails immediately

If an alignment fails immediately, and if the last line of the first <alignment>.log file is:
  1: PREPARING DATA
the reason is probably that your machine does not have nearly enough memory, since MUMmer could not even prepare the data. Try again with Concat unchecked; if that does not work, you need more memory.
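The check on the last log line can be scripted. In this sketch the function name failed_in_prep is hypothetical; it ignores blank lines at the end of the file, which is an assumption about how the log may be terminated.

```sh
# Hypothetical helper: succeed (exit status 0) if the last non-blank line of
# a MUMmer log is the "1: PREPARING DATA" message, otherwise fail.
failed_in_prep() (
    log=$1
    last=$(grep -v '^[[:space:]]*$' "$log" | tail -n 1)
    case $last in
        *"1: PREPARING DATA"*) true ;;
        *) false ;;
    esac
)
```

Run it from the symap directory as: failed_in_prep logs/demo_seq_to_demo_seq2/demo_seq_cc.demo_seq2_f1.log && echo "failed while preparing data"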

One or more fails

Sometimes just one or a few of the alignment processes will fail. You will see a line such as:
  Error: Running command: /Users/cari/Workspace/symap_5/ext/mummer/mac/promer
   -p data/seq_results/demo_seq_to_demo_seq2/align/demo_seq_cc.demo_seq2_f2.promer
   data/seq_results/demo_seq_to_demo_seq2/tmp/demo_seq2/demo_seq2_f2.fa
   data/seq_results/demo_seq_to_demo_seq2/tmp/demo_seq/demo_seq_cc.fa
You will see the failure in the dialog box. The remaining processes will continue. When all processes are complete, you will see a "?" for the pair.

Select the "?" followed by Selected Pair and it will complete the failed processes.

A better approach for large genomes is to set CPUs to a low number, so that the alignment is likely to finish the first time.

Using MUMmer4 from within SyMAP

Sometimes when MUMmer v3 fails, MUMmer v4 will work. MUMmer4 is included in the SyMAP package with a fix to promer (make sure you have v5.0.8 or later). Enter the ext/mummer4 directory and follow the instructions in the README.

My observations on MacOS indicate that MUMmer v4 takes more memory but less time than MUMmer v3.

Running MUMmer from the command line

If you need to run MUMmer using the command line from some other machine, do the following:
  • The query must be alphanumerically less than the reference. For the promer command:
    USAGE: promer  [options]  <Reference>  <Query>
    e.g.
    promer proj2 proj1
    
  • Execute:
    • For different genomes:
      • Use promer
      • The output of promer is input to "show-coords -dlkT".
    • For Self synteny:
      • Use nucmer.
      • The output of nucmer is input to "show-coords -dlT".

  • Result files:
    • The result files must have suffix ".mum"
    • Put the ".mum" files in the directory data/seq_results/<proj1-to-proj2>/align
    • In the <proj1-to-proj2>/align directory, execute:
      touch all.done
      This creates a file, which indicates to SyMAP that the alignments are done and to process the files in the directory ending with ".mum".

  • When you run Selected Pair, SyMAP will recognize the files and use them to build the synteny blocks.

Example of using MUMmer for the loaded projects demo_seq and demo_seq2: if you try this with demo_seq and demo_seq2, make sure there is no current alignment for the pair in the database or on disk (e.g. use Clear Pair).

Demo_seq has chr3 and chr5, and demo_seq2 has chr1 and chr3; each chromosome of one project is compared separately against each chromosome of the other. Demo_seq is alphanumerically less than demo_seq2, so it is first in the directory name but second in the command arguments (this is how SyMAP knows which project chr3 belongs to). The commands are as follows:

   cd data/seq_results
   mkdir demo_seq_to_demo_seq2
   mkdir demo_seq_to_demo_seq2/align
   touch demo_seq_to_demo_seq2/align/all.done
   cd ../..
   ext/mummer/mac/promer data/seq/demo_seq2/sequence/chr1.seq data/seq/demo_seq/sequence/chr3.seq
   ext/mummer/mac/show-coords -dlkTH out.delta >chr1chr3.mum
   mv chr1chr3.mum data/seq_results/demo_seq_to_demo_seq2/align
   rm out.delta
This would need to be repeated 3 more times for chr1chr5.mum, chr3chr3.mum, chr3chr5.mum (suggestion: put the commands in a script). The resulting align directory will have:
   all.done	 chr1chr3.mum	chr1chr5.mum	chr3chr3.mum	chr3chr5.mum
When symap is started and demo_seq and demo_seq2 are selected, there will be an "A" in their cell; select it followed by Selected Pair, and SyMAP will load the alignments and compute the synteny.
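The four runs for the demo projects can be put in a script, as suggested above. The following is a sketch under stated assumptions: the function name align_demo_pairs and the PROMER and SHOW_COORDS variables are hypothetical, the paths and options are copied from the example commands, and it must be run from the symap directory. Unlike the example, it creates all.done last, so the file appears only if every pair succeeded; that ordering is a choice made here, not a SyMAP requirement.

```sh
# Paths copied from the example above; override them for other platforms.
PROMER=${PROMER:-ext/mummer/mac/promer}
SHOW_COORDS=${SHOW_COORDS:-ext/mummer/mac/show-coords}

# Hypothetical helper: align each demo_seq2 chromosome (reference) against
# each demo_seq chromosome (query), writing one .mum file per pair.
align_demo_pairs() (
    align=data/seq_results/demo_seq_to_demo_seq2/align
    mkdir -p "$align" || exit 1
    for ref in chr1 chr3; do             # demo_seq2 chromosomes
        for qry in chr3 chr5; do         # demo_seq chromosomes
            # promer writes out.delta in the current directory by default
            "$PROMER" "data/seq/demo_seq2/sequence/$ref.seq" \
                      "data/seq/demo_seq/sequence/$qry.seq" || exit 1
            "$SHOW_COORDS" -dlkTH out.delta > "$align/$ref$qry.mum" || exit 1
            rm -f out.delta
        done
    done
    touch "$align/all.done"              # created only after all pairs finish
)
```

Run it from the symap directory as: align_demo_pairs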

To view the MUMmer parameters, see MUMmer parameters

MacOS details

Executable on MacOS: Any executable that has not been approved by Apple gets the error message
  cannot be opened because the developer cannot be verified
See MacOS External for the fix.

Early versions of MacOS: The MacOS executables were compiled on MacOS 10.15. They will not work on old versions such as MacOS 10.9. Try the following:

   cd ext/mummer
   mv mac mac_506
   mv mac_pre506 mac
   cd ../muscle
   mv muscle muscle_506
   mv muscle_pre506 muscle
These executables, compiled on MacOS 10.9, may work on your Mac.
Email Comments To: symap@agcol.arizona.edu