AGCoL SyMAP Release Notes UA
BIO5
For existing databases, there are no database or build changes since v5.4.8; see Update.
This will no longer run on 32-bit machines, starting with v5.5.1.
You must have Java v17.0.11 or later, starting with SyMAP v5.5.0.

Summary of recent major releases

v5.5.7 (8-Aug-24) Add xToSymap.
v5.5.6 (14-July-24) Improved the collinear set algorithm.
v5.4.8 (13-Feb-24) New Cluster Hit algorithm Algo2, which is aware of the exon-intron structure.
v5.4.0 (6-Apr-23) Improved Cluster Hit algorithm Algo1 for the detection of gene homology.
v5.3.1 (6-Jan-23) Improved hit alignment features.
v5.3.0 (10-Dec-22) Updated database schema to remove all FPC tables.

Release: v5.5.9 (24-Sept-24)

This is mostly a documentation update, with a few tiny code updates.
  • SyMAP can be run with a "-mum" flag, which prevents any MUMmer files from being removed by SyMAP after alignment.
  • On creating a database, symap checks the MySQL settings and external programs. It also checks the Java minor version number.
  • Developer: remove some obsolete code for orderAgainst, which removed an 'unchecked' call. Write Java version into symap.log.

Release: v5.5.8 (1-Sept-24)

Small changes to xToSymap and some small symap interface changes.
  1. xToSymap:
    1. Summarize: Added output:
      • Output all attributes for gene. Compress the summary lengths counts. If not Verbose, remark if the sequence appears to be hard masked.
      • If Verbose, output base count; print header lines of <=10 unique prefixes; print lengths of first 5 input sequences.
    2. Split: A new feature that will split the converted output into chromosome files.
    3. Summary: Improved determining ">ID" prefixes.
    4. ConvertNCBI: The mRNA products are no longer merged; either the gene description is used or the 1st mRNA product. The 1st mRNA was a gene attribute with keyword "ID="; it now has a separate keyword "rnaID=". The text "gene-" and "rna-" are removed from their respective IDs (to make the names shorter). The option for assigning a list of protein IDs corresponding to all mRNAs for the gene has been removed; now only the 1st protein-id is listed. See NCBI convert.
    5. ConvertEnsembl: The gene has a new attribute of "rnaID=" with the 1st mRNA ID which corresponds to the input exons. It now shares the option of having a gene attribute called "proteinID" which is equal to the "protein=id" of the 1st CDS for the 1st mRNA. See Ensembl convert. Fixed a rare bug: This script removes the "[Source:..." text, but it would incorrectly remove any brackets.

  2. symap: Small interface changes:
    • Project Parameters: Allow commas in the minimal length cutoff.
    • The display name is checked for duplicates.
    • The abbreviation is shown on the Selected summary and "Project" renamed to "Directory".
    • Fixed a rare sort bug in Algo2.

  3. viewSymap:
    • Query: Allow commas in Chr: start or end. The Show popup of a row's content has commas in large numbers, and breaks up long All-Anno lines.

  4. Developers: Most of the xToSymap methods were simplified.
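
As an illustration of the comma handling mentioned above for Project Parameters and the Query Chr: start/end fields, the following Python sketch strips thousands separators before converting to an integer. It is not SyMAP's Java code; the function name parse_count is hypothetical.

```python
def parse_count(text):
    """Convert text such as '1,500,000' to an int, ignoring commas."""
    cleaned = text.strip().replace(",", "")
    if not cleaned.isdigit():
        raise ValueError("not a non-negative whole number: %r" % text)
    return int(cleaned)


# Both spellings of a coordinate are accepted.
start = parse_count("1,500,000")
end = parse_count("2000000")
```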

Release: v5.5.7 (8-Aug-24)

This release has added clarity and ease to loading the input files.
  1. NCBI and Ensembl files
    • xToSymap: A new interface for the convert NCBI and Ensembl scripts, and to replace the scripts/getLengths.pl script.
    • The Ensembl conversion now recognizes a chromosome sequence if the word 'chromosome' is on the header line. It has different rules on what is output.
    • The NCBI conversion now checks for "NT_", "NW_" and "NC_" prefixes (it was just checking for the word 'chromosome' or 'scaffold'). It has different rules on what is output.
  2. Load Project:
    • This will now load most NCBI and Ensembl files directly, though it is strongly recommended to convert them with xToSymap.
    • The fasta file(s) are checked for a suffix of .fsa, .fa, .fna, .fasta, or .seq. The gff file(s) must have a suffix of .gff or .gff3.
    • In the GFF file, there must be an mRNA line for the following exons. The mRNA parent must match the gene ID, and the exon parent must match the mRNA ID.
  3. A&S: Algo 2 would crash on a rare case when sorting exons, which has been fixed.
  4. Demo: was updated to have correct GFF3 files.
  5. Documentation: has been improved for the input and build process.
  6. Developers: a new package called xToSymap has been added.
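
The Load Project rules listed above (accepted file suffixes, and the gene to mRNA to exon parent chain in the GFF file) can be pictured with a small checker. This is only a sketch in Python, not SyMAP's Java code: the function names are hypothetical, it assumes standard GFF3 "ID=" and "Parent=" attributes in column 9, and it assumes each parent line appears before its children.

```python
FASTA_SUFFIXES = (".fsa", ".fa", ".fna", ".fasta", ".seq")
GFF_SUFFIXES = (".gff", ".gff3")


def has_valid_suffix(filename, kind):
    """Return True if the file name ends with one of the listed suffixes."""
    suffixes = FASTA_SUFFIXES if kind == "fasta" else GFF_SUFFIXES
    return filename.endswith(suffixes)


def parse_attributes(column9):
    """Turn 'ID=m1;Parent=g1' into {'ID': 'm1', 'Parent': 'g1'}."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def check_gff_hierarchy(lines):
    """Return a list of problems found in the gene -> mRNA -> exon chain."""
    gene_ids, mrna_ids, problems = set(), set(), []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a feature line
        feature, attrs = cols[2], parse_attributes(cols[8])
        if feature == "gene":
            if "ID" in attrs:
                gene_ids.add(attrs["ID"])
        elif feature == "mRNA":
            if attrs.get("Parent") not in gene_ids:
                problems.append("mRNA %s: Parent does not match a gene ID" % attrs.get("ID"))
            if "ID" in attrs:
                mrna_ids.add(attrs["ID"])
        elif feature == "exon":
            if attrs.get("Parent") not in mrna_ids:
                problems.append("exon Parent=%s does not match an mRNA ID" % attrs.get("Parent"))
    return problems


# Made-up example: the last exon points at the gene instead of the mRNA.
example = [
    "chr1\tsrc\tgene\t1\t900\t.\t+\t.\tID=g1",
    "chr1\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=m1;Parent=g1",
    "chr1\tsrc\texon\t1\t300\t.\t+\t.\tParent=m1",
    "chr1\tsrc\texon\t500\t900\t.\t+\t.\tParent=g1",
]
problems = check_gff_hierarchy(example)
```

In this made-up example, only the last exon line is reported, because its Parent is the gene rather than the mRNA.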

Release: v5.5.6 (15-July-24)

For existing databases, re-run the collinear algorithm as follows: Start ./symap -acs, then select the pair and run Selected Pair (Redo); a popup will have you confirm "Collinear only" indicating that only this algorithm will be run.
  • A&S Collinear algorithm: Improved this algorithm; see Collinear v5.5.6.
  • Query
    • New: A new Report button is at the top of the results table. See Query Report.
    • Bug fix for Multi-hit gene: it would crash if there were no hits for a pair of species.
    • The results table was sorted by Hit#; now it is sorted by Block or Collinear or Chr, and by Hit# within each group.
  • Developers: TableExport.java, TableReport.java and TmpRowData.java are new files. The output of Load Annotation is clearer if there are no annotation files.
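
The new ordering of the Query results table (by Block, Collinear or Chr, then by Hit# within each group) amounts to a two-level sort key. A minimal Python sketch, with hypothetical field names and made-up rows; SyMAP itself is written in Java and its table code is not shown here:

```python
# Made-up rows; "block" and "hit" are hypothetical field names.
rows = [
    {"block": 2, "hit": 15},
    {"block": 1, "hit": 40},
    {"block": 2, "hit": 3},
    {"block": 1, "hit": 7},
]


def sort_rows(rows, group_field):
    """Order rows by the chosen group column, then by Hit# within each group."""
    return sorted(rows, key=lambda r: (r[group_field], r["hit"]))


ordered = sort_rows(rows, "block")
```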

Release: v5.5.5 (19-June-24)

For existing databases, if you ran Algo2 on mammalian genomes, you may want to re-execute the Synteny step to get better clustering of hits for very long genes.

This release mainly upgrades Query, including an option to view tandem genes.

  • Alignment & Synteny:
    • For gene-to-gene hits, Algo2 was keeping hits separate when there was a very long gene with tiny exons; it now clusters them into one. For gene-to-non-gene hits, it now clusters hits into bigger clusters.
    • Algo2 can take NUCmer files as input, but still does not do self-alignments. I tried human and chimpanzee with PROmer vs NUCmer, and PROmer identified more genes.
  • Query
    • Columns:
      • If the hit is not to a gene, the chromosome# will be shown for the Gene# column instead of '-' (e.g. '3.-' indicates the hit is on chromosome 3).
      • Columns PgeneF and PgFSize have been renamed Grp# and GrpSize, respectively. These are assigned values by the queries Gene#, Multi-hit Gene and PgeneF.
    • New:
      • The Gene# filter results have assigned values for the columns Grp# and GrpSize, where a group is associated with the species+chromosome.
      • For the Multi-hits query (see Multi-hit):
        1. New options:
          Exon: only show hits if they overlap an exon in the target gene.
          Same chr: the n hits for the target gene must be to the same opposite chromosome.
          Tandem: the n hits for the target gene must be to tandem genes on the opposite chromosome.
        2. The columns Grp# and GrpSize are assigned values.
        3. Small change: If a chromosome is selected for all species, the hits must be on both chromosomes (it was an 'either'; now it is a 'both').
      • On the results table, the View 2D feature has a new option of Group (see View 2D).
        • When the 2D is shown, all rows with the selected row Grp# and Chr will be highlighted.
        • The selected hit (and group hits) are highlighted in a new "Special" color; this can be turned off by selecting High on the Query interface before display; this release added the ability to turn the highlighting off by de-selecting the Hit Filter High popup option.
    • Interface:
      • The Multi-hits option has been grouped with the PgeneF option, and now has a check mark in order to run.
      • Collinear size now has an Ignore option, so ignoring it no longer depends on leaving the associated text box blank.
    • Small change: Annotation Description no longer automatically includes minor genes since this can be used in conjunction with Every* to get that result.
    • Bug fixes:
      • If a hit was to two minor genes (Every* option), the MUSCLE command would not work.
      • If one of the genomes was not annotated, the View 2D did not work.
  • Chromosome Explorer:
    • New option: The 2D display has a Sequence Filter of Conserved for the reference chromosome, which was renamed to g2x2 and option g2x1 added, where the first highlights all hit-wires and corresponding genes that are conserved across all 3 chromosomes, and the second highlights the hit-wires and corresponding genes that are only found on one pair.
    • The Gene popup coloring of the hit-wire has changed from the "hover" color to the new "special" color.
  • Developers: ComputeMulti.java is a new file.

Release: v5.5.4 (7-May-24)

  • 2D
    • Show Annotation: (1) This used a constant that was appropriate for plant genes, but not mammalian; the constant is now computed based on the average length of genes for each chromosome. (2) The boxes now have background light-gray and border in the exon +/- color. (3) If the last line of the annotation is long, it is now truncated.
    • A tiny problem has been identified and documented in the User Guide; search on "KNOWN PROBLEM" (this will be fixed for v5.5.5).
  • Circle
    • The Two-color all blocks option, which colors the inverted and non-inverted blocks two different colors, can now have its colors changed via the color palette icon.
  • View
    • Add AvgGeneLen and AvgExonLen for the average gene and exon length of each chromosome.

Release: v5.5.3 (28-Apr-24)

Interface changes for the Circle display.
Previous bug fixes: v5.5.1 had a bug in 2D when there was no annotation. v5.5.2 had a bug when the genome was >2000M.
This release has been very well tested (using these datasets and machines)!

Circle display

  • New color options so that they can be changed; see Colors.
  • The circle now remains the same size between different datasets and options.
  • Minor: (1) The mouse turns to a clickable finger when held over the project name (instead of beneath it). (2) The rotate uses smaller increments.
  • Minor 2D: (1) As long as the reference chromosome stays the same, chromosomes can be added/removed, or a different view can be shown, and the circle display will be the same according to the first set of top buttons. (2) A long project name may be truncated if shown on the left, but if shown on the right, the full name can be viewed by sliding the scroll bar.

Release: v5.5.2c (22-Apr-24)

22-Apr-24 bug fix release: A bug from v5.5.1: support for 32-bit machines was dropped, but I failed to test this on large genomes (big thanks to Ricardo for this bug report!).

15-Apr-24 bug fix release: A bug that caused 2D to crash when displaying a sequence track with no annotation. This release still has date 31-Mar-24.

The control bars for Circle, DotPlot and 2D have been made to look and act in similar fashion. There are other small interface changes for clarity and new function.

  • Circle
    • Control bar: new icon buttons for rotate counter-clockwise, Home and (i) quick help.
    • The scroll bar now works. The initial frame is smaller.
  • 2D-display
    • Control bar: Zoom and Show buttons with popup menus replaced the one Selected drop-down. The Scale icon is now a toggle.
    • Sequence Filter and Popup
      • New Full popup option to show the full sequence.
      • New Info (i) and gene Clear filter buttons; and tooltip on mono-texted items.
      • Instead of any change creating a history event, only a change to the coordinates does.
    • Hit Filter Popup: New All Hits popup option to show all hits.
    • Gene Popup: Now lists minor gene hits along with the major; add the block number for the hits.

  • Small changes:
    • Put Circle first on Manager.
    • Circle Control bar: The Self-synteny, Reverse and Rotate are now toggle buttons, and Scale is a toggle icon. The Self-synteny option is only displayed if self-synteny has been computed for at least one project displayed. The Rotate text button is moved next to the rotate circle buttons. The view drop-down has been changed to a popup menu.
    • Dot plot Control bar: The Scale option is now a toggle icon. The Home button now includes scale, and disables when it is home.
    • 2D: Information shows number of histories for > and <. The toggle shows on/off status.
      If a Hit Text option is selected, a prefix or suffix is added to the displayed number indicating its meaning.
      Hit Popup and Information: The first line has been reordered for clarity.
      Hit filter did not have default None checked.
      Sequence filter flipped option now waits until Save to flip (for consistency).

  • Developer
    • Removed props.ProjectPairs; renamed ProjectPool to be PropsDB and cleaned it up. It returns isAlgo1 so that the gene popup knows whether to show the exon score.
    • sequence/Sfilter.java now contains all filter code. The KB drop-downs were not sized; hence, they were big on Linux. history/History.java has all unused code removed. All control bars are created as a row, which makes the delimiter full depth.

Release: v5.5.1 (14-Mar-24)

This release has multiple small interface changes, as follows:

  • 2D-display
    • Control panel: Has a new option to shrink the area between the tracks.
    • Sequence Filter:
      1. Gene#: There is a new filter to show the Gene# beside each gene.
      2. Hit %Id: When the Hit %Id option is off, the length of the vertical line is constant; when it is on, it is proportional to the %ID. When the Hit Length is off, this line is not shown.
      3. (1) The filter window has been re-organized and the track number has been added to the title. (2) Fixed a bug in setting the start and end coordinates, which occasionally did not work. (3) If there were more than 3 tracks, the Conserved option was only available for the first occurrence of the reference track; it is now present for all occurrences.
    • Small changes and fixes: (1) When a 2D display was initiated from the dotplot, one track could be longer than the other; this has been fixed (problem introduced in v5.5.0). (2) The Hit Filter Highlight Hit popup, when turned off, was still highlighting the hit gene(s); it no longer does.

  • Dot plot
    • The Stats pull-down has been changed to an "i" icon button for the help, and an "s" icon button for the statistics. Instead of writing the statistics to the terminal, they are written to a popup (where the text can be copied).

  • Developer
    • Changed most 'long' to integers (there were still 32-bit machines when this was 1st developed).
    • 2D merge overlapping hits for display; it smooths out the hit length display a little for highly conserved genomes.
    • 2D Control Panel changed Select pulldown to popup menu. Add space between control buttons (was squashed on Linux). Less space is used between tracks when there are >2 tracks. The display from the Dot plot is back to pre-v550 length. On gene popup, was not supposed to show exon score for Algo1, but was showing 0.

Release: v5.5.0 (3-Mar-24)

For existing databases, there are no changes (though there are for v5.4.8).
To use SyMAP v5.5.0, you must have Java v17 or later.

  • The symap.jar was compiled with Java version 1.8 (8); it is now compiled with Java 17, which is more secure, faster and was released Sept 2021.
  • 2D display
    1. This display is much faster (really noticeable on slow machines or big chromosomes).
    2. The Hit Length line was only shown if the hit-wire was shown; this is no longer the case.
    3. The Stats button was removed, and the Help was moved to a little "i" button, which provides a popup of quick information.
  • Cluster Hits Algo1
    1. It no longer splits large genes since #2 above handles them better.
    2. The Summary now refers to this algorithm as Algo1 (modified original) and the alternative as Algo2 (exon-intron).
  • Developers: Lots of cleanup to remove OO spaghetti code in 2D.
    • Recreate tracks on new display instead of trying to reuse.
    • Move number.GenomeNumber to Sequence; remove the number class. Changed all numbers in DrawingPanel from objects to long.
    • Merge History class with HistoryControl, and remove HistoryObject.java. Move HistoryControl and HistoryListener to symap.drawingpanel; remove history class.
    • Merge abstract Track.java with Sequence.java. Merge FSized2d.java with Frame2d.java. Rename frame.TrackInfo to ChrInfo and frame.Mapper to LeftMap

Release: v5.4.9 (20-Feb-24)

For existing databases, there are no changes (though there are for v5.4.8).
  • v5.4.8 little fixes: (1) The new Query Multi option had a small bug due to it using ">" instead of ">=". (2) The Query Every* option works for Algo1 as of v5.4.8, but was still disabled. (3) The Summary gave wrong message if EI or En were unchecked.
  • The Query for selected chromosomes and locations was confusing. It is now consistent and properly documented.
  • Improved documentation.

Release: v5.4.8 (13-Feb-24)

For existing databases, rerun the Synteny algorithm.

This release fine-tuned the Cluster Hit Algo2 (first released in v5.4.6) and made some adjustments to Cluster Hit Algo1 for a new interface feature.

  • Synteny Clustering Algo2: See Cluster hits.
    1. The algorithm better differentiates between overlapping genes which have the same aligned hits; one will be the major assignment and displayed for the hit, and others will be minor assignments. There are slight improvements to the assignments of clustered gene hits.
    2. The Cluster Hit Algo2 Parameter defaults check EI and En, along with EE.
    3. The g1 (one gene) and g0 (no genes) have improved assignments. The rules for filtering have been fine-tuned.
    4. The algorithm for finding piles and selecting what clusters to retain has been improved.
    5. Two scores are computed, which are the percent of the clustered hit that aligns to exons and to the genes.
    6. The hit strands could be different from the two gene strands; these can be printed out with symap -wsp or excluded with symap -wse .
    7. Previously, the hit strands could be assigned "+/-" when it should be "-/+" and vice versa; this resulted in the wrong sequence being reversed in the 2D-Align; this has been fixed.
  • Synteny Clustering Algo1:
    1. Compute the percent of the clustered hit that aligns to the gene.
    2. Make an addition so that it works with the Query Multi command (otherwise, all results will be shown twice).
    3. The same problem as shown in 7 above happens with Algo1. It is mostly fixed, though a few slip through. See Strand.
  • 2d-display:
    • The Annotation (yellow boxes of text) have less information so more boxes can be viewed at a time.
    • The gene popup text shows the start/end of each hit on the opposite chromosome in place of the subhits; it shows ALL hits for the gene between the two chromosomes even if they are not visible; it shows the gene overlap (and exon overlap for Cluster Hit Algo2). Both the hit and gene popups have some minor formatting changes for clarity. See Gene and Hit Information.
    • The Align display has the Gene# along with the Id and product (Desc).
  • Query:
    • Add a column to show the percent exon overlap of the hit (Cluster Hit Algo2) or the percent gene overlap (Cluster Hit Algo1).
    • Search changes:
      • A new Multi gene annotation option has been added, which lists all genes with multiple hits.
      • The Every* now works for both Algo1 and Algo2, and the '+' was replaced with a '*' (so would not be confused with strand).
      • The Hit# query option shows all major and minor genes that overlap the hit, where the minor are suffixed with an '*'.
      • The Gene# query now allows a chromosome to be selected for each species, and shows all major and minor hits.
      • The Annotation search with one chromosome selected has been changed so that the annotation has to be on that chromosome (it was allowing the annotation to be on the paired chromosome).
      • The Export for .csv and .fasta allow appending to an existing file.
  • ConvertEnsembl: Was not replacing special characters, which has been added.
  • Developers
    • The formatter for the yellow annotation box was re-written.
    • The Synteny clustering Algo2 was rewritten to be clearer. The amount that a cluster hit overlaps the exon of a gene was an approximation, now it takes into account overlapping subhits and removes the overlap from the calculation.
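
The idea of removing overlapping subhits from the exon-overlap calculation can be sketched as follows. This is an illustrative Python sketch, not the Algo2 code (which is Java and not shown in these notes). The function names are hypothetical, coordinates are assumed to be inclusive, and the denominator (merged subhit bases) is an assumption based on the description "percent of the clustered hit that aligns to exons".

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent (start, end) intervals (inclusive ends)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_length(a_intervals, b_intervals):
    """Number of bases covered by both interval sets, each base counted once."""
    total = 0
    for a_start, a_end in merge_intervals(a_intervals):
        for b_start, b_end in merge_intervals(b_intervals):
            lo, hi = max(a_start, b_start), min(a_end, b_end)
            if lo <= hi:
                total += hi - lo + 1
    return total


def percent_exon_overlap(subhits, exons):
    """Percent of the merged subhit bases that fall within exons."""
    hit_bases = sum(end - start + 1 for start, end in merge_intervals(subhits))
    if hit_bases == 0:
        return 0.0
    return 100.0 * overlap_length(subhits, exons) / hit_bases


# Two subhits that overlap each other by 50 bases (made-up coordinates).
subhits = [(100, 199), (150, 249)]
exons = [(120, 169), (230, 300)]
shared = overlap_length(subhits, exons)
percent = percent_exon_overlap(subhits, exons)
```

Because the subhits are merged first, the bases 150-169 that lie in both subhits are counted once rather than twice.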

Release: v5.4.7 (14-Nov-23)

For existing databases: if using Cluster Hit Algo 2, rerun the Synteny.
  • Query: A new Every+ option for databases built with Cluster Hit Algo 2 post-v546. It shows all genes with a hit aligned to it, even the overlapping ones. This also is used for the Annotation search.
  • Alignment&Synteny: Fixed a bug in the new Cluster Hit algorithm; the bug did not affect the visible results except to have a slight fractional difference in the Summary GNwCH statistics, and the Query NumHits column on a few genes. The new Every+ option is disabled for pre-v5.4.7 databases, as that shows this problem.
  • Developers
    • Removed calls deprecated in Java v20: "exec(String) in Runtime has been deprecated" and "URL(String) in URL has been deprecated". All export routines call Globals.getExport(). Set some gray buttons to white for Linux.

Release: v5.4.6 (22-Oct-23)

For existing databases: There will be a small database update on the first viewing. For the new cluster hit algorithm, change the Cluster parameters and rerun the Synteny.

  • Alignment & Synteny: A new Cluster Hit algorithm has been written to filter and cluster MUMmer hits; the improvement over the original algorithm is that it has knowledge of the exon-intron structure and identifies all gene pairs. See Cluster parameters.
  • Query has a new column that states if a hit is to an exon-exon, exon-intron, etc. This works with the new cluster hit algorithm. This information is also in the Hit Information.
  • Minor stuff
    • 2D Sequence Filter Gene#: The full gene is highlighted instead of just the introns.
    • Summary: The Explain did not format right with Java v20, which has been fixed. The date at the top had "Oct" swapped with "Nov".
    • A&S Cancel: SyMAP was automatically removing any added hits or blocks, but for big databases, the removal can take a long time, hence, there is a popup to request whether the user wants this done; otherwise, they may prefer to just remove the database and start over.
    • Summary contains the number of MUMmer files read when using previous alignments (this provides a clue as to whether Concat was used).
    • The documentation stated that all genes with a hit could be shown in the query, which was not true; the documentation has been updated.
  • Developers
    • The original Cluster Hit code has been moved to backend.anchor1. The SyProj was renamed to backend.anchor1.Proj, and now only used by the Cluster Hit code. The new Cluster Hit code is in backend.anchor2.

Release: v5.4.5 (4-Aug-23)

  • Tested on Linode Ubuntu.
  • Chromosome Explorer 2D
    • Sequence Filter: See Sequence Filter
      • Gene#: A new option of Gene# has been added, which will set the coordinates of a 50kb region around the gene and highlights the gene's introns.
      • Block#, Collinear#: There are new options to show the Block# or Collinear# in the same way as the Hit# and Hit %Id are shown.
      • Highlight Conserved Genes: There is a new option to Highlight Conserved Genes. See Conserved.
    • Hit Filter
      • The Highlight hit popup can be on all the time (i.e. it is not turned off if another highlight is turned on), and is highlighted in the hover color.
      • When the hit popup highlights the hit-wire, it will also highlight the genes assigned to it.
      • The hit popup provides the Gene# of the genes that are being highlighted.
  • Minor stuff
    • Query Table: Fixed a tiny bug from v5.4.2 where the Glen column would not sort.
    • Query Align: With MUSCLE, the Cancel Alignment button does not work; it now gives a message on how the user can cancel the alignment.
    • Block View: A clock shows when bringing up the 2D view (necessary for slow machines).
    • 2D Sequence Filter: (1) The highlighted popup gene now highlights the whole gene (it was just the intron area and delimiter). (2) Setting sequence filters could cause consecutive duplicate views in History (i.e. the forward/back buttons); this has been fixed. (3) Old bug: when a track was reused with a different sequence, the conditional checkboxes could be incorrectly disabled (still one problem left but need to rewrite masses of code to fix).
  • Developer
    • Rename frame.TrackCom to frame.TrackInfo.

Release: v5.4.4 (18-July-23)

  • Query
    • Bug fix for v5.4.3: The extension to make the Gene# search work with suffixes (e.g. 5.a) had a bug when used without suffixes.
  • Chromosome Explorer 2D
    • Sequence Filter: There is a new option for highlighting a gene called Popup; when this option is on, every time a gene popup window is created, the gene stays selected. The filter popup has reduced number of options. See Popups and Sequence Filters.
    • The highlight Hit popup and Gene popup options are on by default.
  • Developers
    • Remove CloseupListener and DrawingPanelListener. Remove DrawingPanelData (merge with DrawingPanel). Move TextData from util to sequence directory. Moved frame SyMAP2d, Frame2D, FSized2d, ControlPanel to drawingpanel because used exclusively by 2d. Rename AlignMain to AlgSynMain.

Release: v5.4.3 (12-July-23)

There are numerous tiny changes, fixes, and two small new features.

For existing databases: (1) View the database with ./symap and okay the database update. (2) Though it is not necessary to rerun the Synteny algorithm, there is a small improvement for assigning hits to embedded genes.

  • Manager
    • Synteny: For embedded genes, the algorithm was assigning the hit to the outer gene; now it assigns it to the closest-fit gene.
    • Minor: When All Pairs or Selected Pair is selected, the user will be asked to confirm the processing. The panel is a little smaller. All frames show the database name. Allow blank lines on annotation input. Add "Prefix" on manager summary.
  • Chromosome Explorer 2D
    • Hit Filter: There is a new option for highlighting a hit-wire called Popup; when this option is on, every time a hit-wire popup window is created, the hit-wire stays selected. See Popups and Hit Filters.
    • When more than two sequence tracks are shown, the middle one now knows the hits on both sides and can list them in the Gene Popup and Align (Max 30kb).
    • Minor: The Sequence Filter options of Show Text now shows the text on the outside of the sequence track rectangle. The panel is a little smaller. The "Id %" is now shown on the Hit Information. Put Info: at beginning of controls. Text popups stay in front, i.e. will not be buried behind panel. Improve the 2D information help.
    • Tiny bug fixes: The end coordinate on this filter window always showed the end of the sequence instead of the displayed end. If the Dot Plot was changed multiple times during a given session of ChrExp, it sometimes messed up. The Dot Plot filters could not be changed after the first view.
  • Query
    • The Gene# query option now allows a suffix; i.e. '100' will return all genes with this number and any suffix, '100.b' will only return this gene.
    • Bug fix: if an annotation query was performed that had a hit to a gene with embedded genes, the gene and the overlapping genes would be returned.
    • Minor: Has a new button Defaults which sets the selected columns to the defaults. In the result table, the Gene# is now the first column.
  • DotPlot: Minor: The Filters Defaults option no longer closes the window (behaves like the 2D filters).
  • Developer
    • The database 'props' table now keeps track of initial database version and updates. The file util.Utilities has all the methods for parsing the sequence 'tag'; the database tag field has a slightly different format. Fix a tiny problem where Query would set a 2D static 'show annot' which would affect other 2D displays. ShowContinue put 'Cancel' as the end button. Remove [Uniprot ID: xxx] from demo_seq/annotation/anno.gff.
    • Moved track/* to sequence directory. Merged SequenceTrackData into TrackData. Removed AnnotationData and PseudoData (was just moving data around). Rename PseudoPool to SeqPool. Remove GroupSorter (code moved to SeqLoadMain) and improved. Change database field pseudo_hits.evalue to 'htype' and rearrange the fields to be more logical.
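
The Gene# query rule described under Query above ('100' returns every gene with that number and any suffix, while '100.b' returns only that gene) can be sketched as a simple match. This Python sketch is illustrative only: the function name is hypothetical, and it assumes gene numbers are stored as text such as "100" or "100.b" without a chromosome prefix.

```python
def matches_gene_query(gene_num, query):
    """Return True if a stored Gene# matches the query text."""
    if "." in query:
        # A query with a suffix must match the gene exactly.
        return gene_num == query
    # A bare number matches that number with or without a suffix.
    return gene_num.split(".", 1)[0] == query


# Made-up gene numbers.
genes = ["100", "100.a", "100.b", "1000", "10.b"]
all_100 = [g for g in genes if matches_gene_query(g, "100")]
only_b = [g for g in genes if matches_gene_query(g, "100.b")]
```

Comparing the number part exactly (rather than by prefix) keeps a query for '100' from also returning gene 1000.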

Release: v5.4.2 (28-June-23)

  • 2D
    • Redesigned the Sequence Filter.
    • History only kept 10 places, now it keeps 50.
    • Fixed bugs:
      (1) when the 2D was initiated from the DotPlot or Query, changing highlight could change "Show" option. (2) Recent v541 DotPlot to 2d bug: If a second block was selected for 2D, the genes would not be shown; with this fix, multiple 2D from Dotplot can be shown again.
  • Query
    • Added an export option of HTML.
  • Developer:
    • The Abstract Filter file was removed and the Hit and Sequence filter codes rewritten without it. Changed number of tracks from 100 to 30.

Release: v5.4.1 (22-June-23)

For existing SyMAP databases: there is a new column in Query called NumHits; run symap -y to update all projects in the database.
  • Manager
    • Bad bug fix for v5.4.0: Loading the hits would get a memory crash if some chromosomes in a project had genes and other chromosomes did not.
    • If only two projects are selected from the left, then their joint cell is automatically selected.
  • Dot Plot
    • Info: (see Control Panel) A new pull down with options: (1) Stats: in the Information area (short form), (2) Print: printed to the terminal (long form), (3) Help.
    • The dot colors can be changed with the Color Wheel (see Color Wheel).
    • Dotplot can no longer display multiple 2d from the same dotplot display. (v542 allows many again)
  • 2D
    • Info: A new pull-down with options: (1) Stats: Hovering in the gene space but not on a gene will show summary info; hovering in the hit space but not on a hit will show hit summary info. (2) Help: Help will be shown instead of summary information.
    • Display bug fix: If the "Show" filter was changed to show more hits, then changed back to show less, the right click in hit empty space would keep showing hit-wire information.
  • Query
    • Gene# in the Pair Hits section: This query now works exactly like the other 3 in this row, i.e. you can set the chromosome, and it returns full hits. NOTE: there will be no rows returned if there is no hit to the gene.
    • Single NumHits: A column has been added to indicate whether there are any hits to the gene in the entire database (not just those queried).
    • Column Description: This was being shown on the bottom of display; it is now a tool tip.
    • Clear Column: This button has been added. It replaces the Show Stats button when the columns are being shown.
  • Developer:
    • The MySQL connection has been rewritten to be more straightforward and to confirm that connections are correctly closed. There is a new dedicated connection for each of the displays (e.g. DotPlot, 2D, etc). This removed 4 java files and replaced one.
    • MacOS Monterey no longer has "Times and Lucida", used by Serif, neither does JDK until v18, so these were removed. However, removing this only occurrence does not get rid of the Java message.

Release: v5.4.0 (6-Apr-23)

This release focuses on improving the detection of gene homology in the Synteny computation.
For existing databases, rerun Alignment&Synteny using existing MUMmer files.
  • Synteny:
    • Creating clustered hits has been improved for long genes.
    • Filtering hits has been updated to give preference when at least one end is a gene.
    • Details of changes:
      → Long hits with embedded genes are split, which helps detect hits to the embedded genes.
      → When a candidate gene is created from hits, it is checked that it does not span a gene with no hits. Candidate genes used to be created using the maximum length of the annotated genes; now the maximum size is 100k.
      → For finding the top N hits, the maximum of the query and target lengths of a clustered hit (each computed as the summed subhits minus overlaps) is used instead of the summed query subhits.
      → A clustered hit could have a mix of subhits that were on opposite strands and the same strand; this does not happen anymore. (This is rare.)
      → %Identity was just using the query match; now it is Max(pct(query), pct(target)) where pct is the approximate average %identity of the subhits.
      → %Similarity was the value of the first subhit of a clustered hit; now it is computed like the %Identity.
      → Pair Parameters: The TopN and MinDots parameters are checked for integer and >0. When A&S is executed, the non-default parameters are printed to the terminal immediately.
    • These changes increased the number of gene homology hits for Humans to Mouse (Chrs 5,17,X) from (68.6%, 57.0%) to (79.9%,66.3%), and for Humans to Chimpanzee (Chrs 5, X) from (53.9%, 54.6%) to (69.9%, 56.7%).
    • Note, these changes cause more hits to be saved, and hence, the synteny takes a little longer. It may also remove or add small blocks due to additional intervening hits.
  • Summary: See Summary.
    • The summary has been rewritten in a textual format which can be copied with copy & paste.
    • All hit and block statistics were computed in terms of clustered hits, which may have gaps and overlaps between the subhits. Now many of the statistics are in terms of the subhits, and it is clear as to whether clustered hits or their subhits are being used.
    • Collinear sets have been added to the summary.
    • The Project summary has exon stats, which have been removed from the View; the gene size ranges have been removed.
    • It states whether either input project had Order Against or Mask genes set.
  • Query:
    • Hit Cov is a new hit column, which is the maximum subhit coverage between the two species connected by the clustered hit. The Hit Cov is computed by summing the subhits and removing any overlap.
    • Annotated (Gene Hit) has an additional option of Either.
    • Collinear size now has the options of >=, =, <=.
    • Export CSV has an option to not include the row number column. This is useful for comparing two lists, where the row number can cause most rows to appear different. The quotes around columns have been removed; any commas in the descriptions are replaced with ";".
    • The Show was disabled for singles; it has been enabled.
  • Chromosome Explorer:
    • The text of the Hit popup has been slightly changed and the Subhit coverage has been added (see Hit Popup).
  • Developer:
    • The Hit assignment algorithm has been restructured.
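
The Hit Cov description above (sum the subhits, remove any overlap, and take the larger of the two sides) can be illustrated with a short sketch. This is not SyMAP's code: the function names `covered_length` and `hit_cov`, the inclusive coordinates, and the sample subhits are assumptions made for illustration only.

```python
def covered_length(subhits):
    """Total bases covered by (start, end) intervals, counting overlaps once.

    Coordinates are assumed to be inclusive.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(subhits):
        if cur_end is None or start > cur_end:
            # Disjoint from the running interval: close it out and start anew.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps the running interval: extend it if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def hit_cov(query_subhits, target_subhits):
    """Larger of the two per-side coverages of one clustered hit."""
    return max(covered_length(query_subhits), covered_length(target_subhits))


# Hypothetical clustered hit with three subhits per side.
query = [(100, 199), (150, 249), (400, 449)]        # first two overlap
target = [(1000, 1099), (1200, 1299), (1400, 1449)]  # no overlaps
print(hit_cov(query, target))
```

Merging sorted intervals before summing keeps overlapping subhits from being counted twice, which is the difference between a subhit-based and a clustered-hit-based length.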

Release: v5.3.5 (3-Mar-23)

  • Cancel
    • The Cancel button did not always work well; now it does. See Cancel.
    • From v5.3.4, if the A&S was cancelled or interrupted, it was not possible to restart the A&S without reloading the projects.
  • Load project
    • Any sequences that do not have the Group prefix will be ignored.
    • The Group prefix is case-insensitive.
  • Align (Max 30000)
    • The selected region was extended to the whole genes within it; now it just extends to the ends of the last overlapping exon of the selected region.
    • The exons are labeled, e.g. Exon #11. If the end of the gene is not shown, a left or right arrow will be displayed indicating that it continues. See Align.
    • The color icon now has Exon+ and Exon-, where the default colors correspond to the 2D display.
    • Bug fix: when there were overlapping genes, the exons could be wrong for one of them.
  • Developer:
    • The Logger interface has been removed. In the ManagerFrame, the load methods have been moved to LoadProj, and the remove methods have been moved to RemoveProj. The AnchorsMain methods have been reworked for clarity but provide the same answer. The debug statements were cleaned up.
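
The Load project rule above (sequences without the Group prefix are ignored, and the prefix match is case-insensitive) amounts to a simple filter. The sketch below is illustrative only; `select_by_group_prefix` and the sequence names are hypothetical and do not come from the SyMAP code.

```python
def select_by_group_prefix(seq_names, group_prefix):
    """Keep only the names that start with the prefix, ignoring case."""
    prefix = group_prefix.lower()
    return [name for name in seq_names if name.lower().startswith(prefix)]


# Hypothetical sequence names from a FASTA file.
names = ["Chr01", "chr02", "CHR03", "scaffold_12", "Mt"]
print(select_by_group_prefix(names, "chr"))
```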

Release: v5.3.4 (19-Feb-23)

Existing SyMAP databases: if the project parameters seem wrong, bring up the Parameters window and Save.
  • All "Help" in the SyMAP Project Manager has changed to ? for online help. The online documentation has been improved for the parameter explanations; see Quick Guide and Parameters.

  • The Project Parameters window is clearer; see Project Parameters. Since parameters are written to file and database, they can get out of sync; this problem is reduced. The popup messages are more concise.
  • The Pair Parameters window is clearer; see Pair Parameters. The parameters used to only be applied if A&S was run right after setting them; now they are maintained across symap sessions.

  • Developers:
    • There has been a lot of renaming and restructuring. For example, the symap.projectmanager.common package has been shortened to symap.manager, and a few of the classes in this package were renamed for clarity. All ChrExp and 2d display classes have been renamed and moved to the frame package. A database package has been created for the database specific classes. The project and pair parameter code has been redesigned.

Release: v5.3.3 (24-Jan-23)

This release has various minor changes and additions.
  • Synteny&Alignment
    • Synteny algorithm: The block start and end were assigned coordinates at the mid-point of the first and last hit; this has been changed to the start of the first hit and the end of the last hit.
    • Selected Pair Parameters: The No Overlapping Blocks was removed because it does not work. The Summary shows changes from defaults for the MinDots, TopN and Merge Blocks parameters.
  • Print Image
    • Most SyMAP graphics have a print image icon, which was not working for Java versions after v8; it now works for any version.
  • Dotplot
    • Filter - see DotPlot Filter.
      • New Green, ID%, Mix and Show Block Numbers options. The dot size was moved from the main display to the filter. Add ? button.
      • The filter display now freezes the other panels. It no longer resets to default Filters on Home or selecting a cell.
    • The cells display all at once; before they were added one at a time.
    • Home now works for the 2-chromosome dotplot, which goes back to full view.
  • Developer:
    • Compiles under Java 17 with no warnings (removed Observable). Massively reduced and cleaned Dotplot code. Removed many un-used files from the util package and moved some others to the package of their sole use. Removed arrow from the end of the gene in the CloseUp. The classes_ext have been cleaned of most unused classes and files.
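
The change to the block coordinates described above can be compared side by side in a small sketch. It assumes hits are (start, end) pairs on one sequence and that "first" and "last" refer to the order by start coordinate; the function names and sample hits are hypothetical.

```python
def block_bounds_midpoint(hits):
    """Earlier rule as described: mid-points of the first and last hit."""
    ordered = sorted(hits)
    first, last = ordered[0], ordered[-1]
    return (first[0] + first[1]) // 2, (last[0] + last[1]) // 2


def block_bounds_full(hits):
    """Rule since v5.3.3 as described: start of first hit, end of last hit."""
    ordered = sorted(hits)
    return ordered[0][0], ordered[-1][1]


# Hypothetical hits of one synteny block.
hits = [(1000, 2000), (5000, 5400), (9000, 9600)]
print(block_bounds_midpoint(hits))
print(block_bounds_full(hits))
```

With the newer rule the block fully contains its first and last hits rather than cutting through them.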

Release: v5.3.2 (14-Jan-23)

This release added ? buttons for online help, and has some small Query and 2D Color Icon updates.
  • Load and synteny: View will show the sequence and annotation paths used along with their date. If an existing alignment is used, then Summary will show the date that the alignment was run.
  • Chromosome Explorer 2D
    • Color Icon
      • Default: Changed from resetting all colors to just resetting the colors for the current tab.
      • ?: a new button that links to the online user guide for instructions.
      • Fixed an old bug where the Default did not always work until viewSymap was restarted.
    • The Sequence and Hit filter, and the Selected pull-down, have ? links to their online help.
  • Queries
    • The selected columns will be saved, as detailed in Save Columns.
    • The Help has been changed to ? for online documentation.
    • A new html page has been created for the Queries at Query.
    • The columns would sometimes be displayed in a weird order, which has been fixed.
  • Developers: Created utils/Jhtml.java that has all html links and code in one place. The /properties only contains colors and mainly those that can be changed by user.

Release: v5.3.1 (6-Jan-23)

Improvements to viewing the alignments of hits; see Alignment.
  1. 2D
    • Align (Max 30,000)
      • Each MUMmer subhit is now shown at the bottom as a full dynamic programming alignment (it was broken up at each gap).
      • The MUMmer subhits can overlap, so they are now staggered in the graphic display.
      • The sequence is no longer converted to uppercase, i.e. if the sequence is soft-masked, it will show on the alignment.
      • The hit color can be set in the color wheel.
    • Hit Information:
      1. Align Hit New : When you hover over a hit-wire and right-click, an information box pops up; that box now has an Align Hit button that shows a text alignment in blast-style format.
      2. The Hits for each track are listed in descending start order, and the gap is negative if it overlaps with the previous.
    • Show Seq Options New: The Show Sequence is now Show Seq Options which has a popup menu that allows the user to select whether to show the hit sequence, reverse hit, or region from the selected area.
    • Sequence Filter: Has a New option Show Hit# which shows the hit# in the same location as the %Id Value (only one of these two options can be active at a time).
    • Tiny problems fixed: (1) Fixed a problem where a sequence could not be selected for Align or Show Sequence if the zoom region was small. (2) When the view went from 2D->Circle->2D, the "Selected:" would not be reset to "Zoom All". (3) The arrow has been removed from the graphical align hit because its meaning was not clear. (4) The term 'merged hits' was changed to 'clustered hits', where the cluster has sub-hits; the term merged could be misleading.
  2. DotPlot
    • Multiple 2d displays can occur now.
  3. Developers: (1) Removed over 15 useless files of code. Removed used Cache from everything. (2) Cleaned up the Mapper package hit code and Closeup package.

Release: v5.3.0 (10-Dec-22)

For existing SyMAP databases: the database schema will be updated on the first viewing of an existing database.

This release does not have any new features, but FPC was removed from the schema and code, which is a major internal change. Typically the SyMAP software can view an earlier or later version SyMAP database, but that is not the case with this release.

  1. Queries: fixed a v5.1.9 bug in location filtering.
  2. Dotplot: fixed a v5.1.9 bug; if a region was selected, the Hit Filter was set to show no hits.
  3. Github:
    • Removed most of FPC from the code and Schema.
    • Simplify classes_ext and Makefile

Release: v5.2.1 (1-Dec-22)

For existing SyMAP databases: The colors in the circle view will be different for the 2nd project.
  1. Circle view
    • Whole genome: Reverse: A new function that allows the reference to be reversed. Also, the colors of a project's blocks stay consistent if compared to projects with a different number of chromosomes (n<=25).
    • Chromosome view: The selected reference will have its colors shown by default.
    • The project name that is reference is in bold italics.
  2. Queries:
    • High: Added check box to unselect if the user does not want the selected hit highlighted.
    • Improved the information for the MUSCLE alignment. See Query Align.
    • Fixed Clear Filters to leave Chr: enabled.
  3. SyMAP manager:
    • View: A new link that brings up the chromosome information for a project.
    • Collinear computation bug fixes: (1) self-chromosome can cause error so disabled. (2) If a chromosome had no hits (can happen with small scaffolds), collinear was not computed.
  4. 2D Sequence Filter bug: "Apply" could perform an unwanted change of boundaries on sequence.
  5. Developers: Removed 3D from code, classes_ext.tar.gz, Makefile. Put in a check for no projects, as a user's trace showed it can happen. Remove the Filtered interface. Removed more FPC code (Drawing Panel, Mapper Pool, SyMAP Frame). Changed 2 Observers to PropertyChangeEvent. Reduce /props from 4 to 1 file.

Release: v5.2.0 (20-Nov-22)

For existing databases, see Update; the synteny algorithm needs to be rerun to get the collinear changes.

This release is concentrated on improving the algorithm and display of collinear sets; see details. There is new viewing support for Collinear sets: see 2D Hit Filter and Queries.

  1. The symap and viewSymap scripts have been changed to access SyMAPmanager instead of ProjectManagerFrame. NOTE: if you are updating, make sure you get these new scripts!!
  2. Processing
    • Collinear sets:
      • The collinear sets algorithm has been greatly improved for more accurate results. The Alignment&Synteny needs to be rerun to compute the updated sets.
      • Each collinear set has a unique number for the chromosome pair.
    • The Hit# used in Queries and the 2D display was the index into the database and could get very large. A hit number is now computed during processing, which is a sequential number for the chromosome pair (e.g. there will be a Hit#1 for each chromosome pair with hits).
  3. Chromosome Explorer 2D display:
    1. Hit Filter:
      1. Has many new options, see Hit Filters
      2. The Cancel and Defaults buttons never did work right; they do now.
    2. Color wheel, for Hits, there are two new options:
      1. Hover provides the color the hit-wire turns when the mouse hovers over it.
      2. Highlight1 and Highlight2 provide the highlight colors, which alternate when Collinear sets or Synteny blocks are shown.
    3. Sequence Filter:
      1. Gene Delimiter: This is a new option, which draws a black line over the top exon of each gene when they are zoomed enough to see the exons. See 2D view.
  4. Queries
    1. The Show Synteny has been changed to View 2D with the options (2nd two are new):
      (1) Region (kb), (2) Synteny Block, (3) Collinear Set. See Results.
    2. Show: This is a new option which produces a popup of all data for the selected row. This allows copying any text.
    3. The collinear column displays the set as N.M, where N is the size and M is the unique set number.
    4. Gene#: This is a new query, where one gene number can be entered (no suffix). This is a good way to view all hits for a given gene.
    5. Hit St: There is a new column, which is "=" if the two hit ends are equal, else it is "!=".
    6. A hit may align to overlapping genes, but only one hit pair is shown; now the best overlap for each side is shown (before it was random).
  5. Assorted minor stuff
    • ConvertNCBI script has new "-p" and "-pa" options to include as attributes the first protein or all protein names for the gene.
    • Schema update: there is a schema update that occurs the first time an existing database is read.
    • The SyMAP Queries button has been moved and renamed Queries. The 2D Filter and Color Wheel panels will be displayed in the middle of screen. The 2D Information box now shows the Hit Filter settings.
    • The SyMAP version of running synteny is on the summary panel.
    • All visible traces of FPC have been removed. From this release forward, FPC will no longer be supported. However, there are still releases from AGCoL and Github that support it; see Update.
  6. Developers
    (1) All 'new Integer()' etc, have been removed. (2) Schema update: The 'runum' has been added to the hits table, which provides the set number for collinear pairs. The SyMAP version is now stored in the projects and pairs table. The annotation table contains a new column for the number of hits. (3) FPC code is all mixed in with the seq-to-seq; but it is no longer consistent with changes.
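
The per-chromosome-pair Hit# and the N.M collinear label described above can be sketched as follows. This is only an illustration: the release note does not state the order in which hits are numbered, so the sketch simply numbers them in input order, and `number_hits`, `collinear_label` and the sample database indexes are hypothetical.

```python
from collections import defaultdict


def number_hits(hits):
    """Give each hit a sequential number that restarts for every chromosome pair.

    `hits` is a list of (chr_pair, db_index) tuples; numbering follows the
    input order (an assumption, not taken from the release notes).
    """
    counters = defaultdict(int)
    numbered = []
    for chr_pair, db_index in hits:
        counters[chr_pair] += 1
        numbered.append((chr_pair, counters[chr_pair], db_index))
    return numbered


def collinear_label(size, set_number):
    """Format a collinear set as N.M (N = size, M = unique set number)."""
    return f"{size}.{set_number}"


# Hypothetical hits with large database indexes.
hits = [("1-1", 90211), ("1-2", 90215), ("1-1", 90230)]
for chr_pair, hit_num, db_index in number_hits(hits):
    print(chr_pair, f"Hit#{hit_num}", db_index)
print(collinear_label(5, 12))
```

Each chromosome pair gets its own Hit#1, so the numbers stay small regardless of how large the database index grows.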

Release: v5.1.9 (26-Oct-22)

For existing databases, see Update.

Release v5.1.7 to v5.1.9 greatly improved the Gene# assignment algorithm and the 2D gene placement.

Added -z option: To execute the latest Gene# assignment algorithm, execute ./symap -z, then select Reload Annotation; the synteny algorithm does NOT need to be rerun.

(25-Oct-22)

This release has some interface improvements to SyMAP Query and the 2D gene placement algorithm was improved.
  • SyMAP Project Parameters
    • The Parameters display has a new parameter called abbrev_name, which must be a 4 character abbreviation. This is used in the SyMAP Query column headings. If one is not entered, then the last 4 characters of the display_name are used.
    • Bug fix: A Save after annotation was loaded would zero out the counts for annotation keywords, which resulted in no annotation keywords (only All_anno) being shown in the Query Columns; this no longer happens.
  • 2D display
    • The gene placement algorithm was improved.
  • SyMAP Query
    • Hit Pairs Table: The Location columns now include the Gene start/end/len and the strand assigned to the gene in the GFF file.
    • Single Genes Table: Only shows the columns that are relevant to genes.
    • Hover over column: the description is shown at the bottom.
  • Load Annotation: This computes the Gene#, and was allowing a gap of 3 without calling them overlap; this has been changed to 0.
  • Demos: A few small changes to the annotation files. Params updated.
  • Developers: Cleaned-up trace and debug flags.
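
The abbrev_name fallback described above (use the entered 4 character abbreviation, otherwise the last 4 characters of display_name) is sketched below. The helper name `resolve_abbrev_name` and the error handling are assumptions; the release note does not say what happens when display_name is shorter than 4 characters, so the sketch simply returns what is available.

```python
def resolve_abbrev_name(display_name, abbrev_name=""):
    """Return the abbreviation used in the Query column headings."""
    abbrev = abbrev_name.strip()
    if abbrev:
        if len(abbrev) != 4:
            raise ValueError("abbrev_name must be exactly 4 characters")
        return abbrev
    # Nothing entered: fall back to the last 4 characters of display_name.
    return display_name[-4:]


print(resolve_abbrev_name("Arabidopsis"))
print(resolve_abbrev_name("Arabidopsis", "Arab"))
```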

Release: v5.1.8 (17-Oct-22)

For existing databases, see Update.

This release continues to concentrate on making the 2D display provide more information.

  • Processing
    • Gene# assignment: the algorithm has been redone so that all overlapping and contained genes have the same number with suffixes a-z; and if there are more than 26 overlapping genes, then it starts over with numbers following the a-z; e.g. Gene #100a, #100z, #100a1, #100b1, ...
    • Load sequence: the project will now be removed if the load fails.
  • 2D display
    • The placement of overlapping genes is greatly improved; the improvements depend on reloading the annotation for v5.1.8 (though it works as before without reloading).
    • When the mouse is in the gene space, it will only provide Information if the mouse is over a gene; this makes it easier to run the mouse down the gene space to find a given gene number.
    • The length of the exons is included in the gene tag, e.g. Gene #100a (Exons 3 2123bp)
    • Right click in the yellow annotation box was not working for 2D from Dotplot or SyMAP Query, so it has been disabled, since the right click on gene annotation works fine from everywhere.
  • SyMAP Query
    • The Gene# column now shows the number and suffix; it is of format chr.genenum.suffix. For older databases, this does not provide useful information as it expects v5.1.7 format or later.
    • Single: The logic has been changed so that if you select a project, then you can select a chromosome.

Release: v5.1.7 (12-Oct-22)

For existing databases, see Update.

This release concentrates on making the 2D display provide more information with different colors for genes and hits.

  • Processing
    • Gene# assignment: Overlapping genes have been numbered with the same gene number; now they are given a suffix, e.g. Gene #1011a, Gene #1011b. For existing databases, the annotation must be re-loaded to get this feature; then the synteny needs to be recomputed (existing MUMmer files will be used).
    • ConvertEnsembl has a new option "-u" to use the sequence names from the FASTA file.
  • Chromosome Explorer:
    • Colors (Color wheel in upper right corner):
      • Hits are now colored brown (hits have the same strand on both sequences) or green (different strands).
      • Genes can now be colored according to if they are on the positive or negative strand, where the default is dark blue (positive) and purple (negative).
      • The FPC options are no longer shown if there is no FPC in the 2D display.
    • Annotation
      • Overlapping genes are now staggered, i.e. the exons of one overlapping gene will be displayed further out than the other.
      • When right-clicking on a gene or its yellow text box, the results now show the hits to the gene along with the gene information and exons.
      • When right-clicking on a hit-wire (between two chromosomes), a popup of its merged hits is shown (or single hit).
      • Tiny bug fix: Yellow text box: if one clicked on the outer edge, it would popup the wrong gene.
  • SyMAP Query:
    • The Colinear genes filter was using > instead of >=. It also has a new filter of "="; hence, you can set the chromosomes and the Colinear genes to get all colinear runs of N on a specific pair of chromosomes (e.g. 3.1.9).
    • SyMAP does a better job of making filters inactive if they will not be used or have not been "checked".

Release: v5.1.6 (20-Sept-22)

This release has improvements to the Block display, 2D display and SyMAP Query.
  • Blocks
    • For the single chromosome view: Inverted blocks have dotted outline.
    • The windows are better sized.
    • Fixed rare bug: For the initial multi-chromosome view: If there are many chromosomes, they were being cutoff; now the scroll bar allows them all to be shown.
  • 2D Display
    • The full gene description can be viewed by right clicking on a gene annotation in the track; see UserGuide.
    • Hover over a hit may include "Inv" (inverted block) and will include "gNrM" (N exon hits, M colinear run size); see UserGuide.
  • SyMAP Query
    • Add general columns Hit %Id, Hit %Sim, Hit #Merge
    • Add species column Len which is the length of the hit (or merged hit).
    • Bug Fix: Show Synteny: the hover over the hit only showed the correct hit Information for those below the selected hit - that is fixed.
    • Gene# of 0 will sort to bottom of table with a '-'.
    • MUSCLE alignment: (1) Fixed rare bug. (2) The alignment files will be left in the muscle directory, but over-written with each new alignment.
  • Alignment&Synteny: (2 tiny things)
    • If promer or nucmer are overridden in the Parameters, it will say so on the Summary page (unless the alignment is re-used).
    • Fixed rare bug: if the Parameters are changed before any alignment has been done, there is an error because the data/seq_results directory has not been created.

Release: v5.1.5 (13-Sept-22)

  • Online Docs - the biggest change of this release is an updating of the documentation.
  • Alignment & Synteny
    • Clear Pair provides option to "only delete synteny from database" or "delete synteny and remove files".
    • The MUMmer %Sim column is now loaded in addition to the %Id column.
    • The Hit# will be assigned sorted on the query start coordinate.
  • Chromosome Explorer
    • Hover over Gene: the gene information will be shown when hovering anywhere within the gene's rectangle. (It used to only show Gene information if the mouse was hovered directly over the center.)
    • The Hit information now has %Id and %Sim, plus the number of merged hits.
  • Blocks - the chromosome blocks view has a smaller window.
  • demo_seq2 - Removed duplicate genes (worked okay, but technically wrong).
  • Developers
    • If all hits are removed (e.g. Clear Pair), the index will be reset so that the next loaded hit will have a HitIdx (Hit #) that starts at 1.
    • The alignment files may have header lines. Any line that does not start with a digit will be ignored.
    • Removed some dead tables from Schema and changed a few datatypes.
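
The header rule above (any alignment-file line that does not start with a digit is ignored) is easy to express as a filter. The sketch below takes the rule literally and tests only the first character of each line; the function name `data_lines` and the sample lines are hypothetical and are not real MUMmer output.

```python
def data_lines(lines):
    """Yield only the lines whose first character is a digit."""
    for line in lines:
        if line[:1].isdigit():
            yield line.rstrip("\n")


# Hypothetical file content: a header, a blank line, and two data rows.
sample = [
    "[S1]\t[E1]\t[S2]\t[E2]\n",
    "1200\t1850\t40210\t40860\n",
    "\n",
    "NUCMER\n",
    "3001\t3500\t51000\t51499\n",
]
for row in data_lines(sample):
    print(row)
```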

Release: v5.1.4 (4-Sept-22)

  • SyMAP Query
    • Filters:
      1. Added a General section for the annotation and location, since they can work with all other queries. Made the Single project (Orphan genes and all genes) a separate section.
      2. Add a filter to Show genes w/o hits for a project, which shows all genes.
      3. Changed the Orphan genes filter to show orphans relative to the chosen projects (it was across all synteny projects, even projects not selected for the SyMAP query).
      4. The Hit# query did not work in some situations.
    • Improved the query summary. Also, the location had been left off the summary for most filters.
    • Replaced the result table #Gene column: it was computed per query, whereas the replacement is stored in the database. The replacement is the order number of the gene along the chromosomes, which stays constant across all queries and is shown on the Chromosome Explorer Annotation and the hover of the gene.
  • Chromosome Explorer
    • Sequence Filter: If a wrong coordinate was entered, the program exited.
    • Alignment Bug Fix: only the first part of the alignment was working (fairly recent bug).

Release: v5.1.3 (30-Aug-22)

There is no major new development, but small interface improvements and SyMAP Query has a small bug fix.
  • Main window
    • Projects listed on left are now in alphabetical order per category. If the project is loaded into the database, the load date is shown to its right. If it has computed synteny, the number of syntenies is in brackets.
    • The symap window is centered, along with the Chromosome Explorer and SyMAP Query window.
    • viewSyMAP: only shows the projects that have synteny computed.
  • Load Annotation
    • grp_prefix: This can now be removed from existing projects, i.e. if the 'Chr' was left on, so the 2D display is clustered with Chr01, Chr02, etc, then the 'Chr' can be removed.
    • annot_kw_mincount: the Help made no sense for this. It has been upgraded to do the following: the SyMAP Query shows the GFF attribute columns that have >=annot_kw_mincount occurrences. It can be changed at any time through the symap parameters window.
    • When the annotation is loaded and the attribute (column 9) is parsed for keywords, the only keywords considered are those followed by a '=' (it used to allow spaces between keyword and value).
  • 2D chromosome view
    • The popup annotation is now a smaller constant sized scrollable window.
    • The annotation hover and popup coordinates are more readable.
  • SyMAP Query
    • The Block query is now on just the block number along with the selection of chromosomes.
    • Small changes:
      • The Results Panel includes the number of rows in the query.
      • The query will not run if there is an error in the input, and there is better error-checking.
      • The window is smaller and the columns panel in the table is fixed width.
      • When the full block name is shown, any leading zeros are removed, e.g. 01.03.2→1.3.2.
    • Bug fix: The Orphan query with one or more chromosome selected gave wrong results (it was obvious); moreover, if a start or stop was entered, it gave an error. This is fully functional now.
  • Convert
    • Both scripts work if there is not a gff file (they were requiring it).
    • For both scripts, the bases that are not ACGT are summarized.
    • ConvertNCBI - anything that does not have "chromosome" in its ">" line is considered a scaffold. (ConvertEnsembl already does this).
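
The Load Annotation rules above (only keywords immediately followed by '=' count, and the Query shows attribute columns with at least annot_kw_mincount occurrences) can be sketched as follows. This is an illustration under stated assumptions: attributes are taken to be separated by ';' as in GFF3, and `keyword_counts`, `query_columns` and the sample attribute strings are hypothetical.

```python
from collections import Counter


def keyword_counts(attribute_columns):
    """Count keywords that are directly followed by '=' in GFF column 9."""
    counts = Counter()
    for attrs in attribute_columns:
        for field in attrs.split(";"):
            key, sep, _value = field.strip().partition("=")
            # Reject fields without '=' and keys separated from '=' by a space.
            if sep and key and " " not in key:
                counts[key] += 1
    return counts


def query_columns(attribute_columns, annot_kw_mincount):
    """Keywords that occur at least annot_kw_mincount times."""
    counts = keyword_counts(attribute_columns)
    return sorted(k for k, n in counts.items() if n >= annot_kw_mincount)


# Hypothetical attribute columns from three gene lines.
attrs = [
    "ID=g1;Name=abc;product=kinase",
    "ID=g2;Name=def",
    "ID=g3;Note some text",
]
print(query_columns(attrs, 2))
```

Raising the threshold hides rare keywords, which keeps the Query column list short for annotation files with many one-off attributes.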

Release: v5.1.2 (19Aug22)

This release associates Exons with their genes; hence, your database needs to be updated.
  • Annotation
    • Input format:
      1. The ConvertNCBI and ConvertEnsembl scripts created separate gene.gff and exon.gff files; now they create one file. The ConvertEnsembl was not producing a gap.gff file, and now it is. Both have been updated to be aware of more input variations and provide a better summary. See Convert for details.
      2. The Demo_seq has separate gene.gff and exon.gff files; now it has one file.
      3. SyMAP can still load separate gene and exon files, but its association of exon to gene is not as accurate.
    • Schema change
      1. An existing database will be updated the first time it is viewed.
      2. This adds a MySQL field to associate Exons with their Gene; for an existing database, the annotation would need to be reloaded for this to happen and be displayed (see 2d Gene Popup below).
    • Load Annotation
      1. The code was treating 'CDS' as 'exon', now only exons are loaded.
      2. The attributes were not being loaded for Exon; now, the "parent" attribute is loaded for the exon and shown when it is moused over.
      3. Quotes are removed from attributes.
      4. The exons are now being mapped to the parent gene to be used in the 2D display.
  • Chromosome Explorer
    • Tiny bug fix: the "Circ" button could get stuck in inactive mode.
    • 2D display
      • Tiny bug fix: For a single contiguous hit for a flipped chromosome, the hit length graphic was not aligned right.
      • When the Gene or Exon is moused over, it will include the number of exons or exon#, respectively.
      • When the hit line is moused over, it will include the project name, start and end coordinates.
      • 2d Gene Popup: Includes the exons for the gene.
  • SyMAP Query
    • Add warnings for bad block or collinear input.
  • Developers
    • The code is now Java v8 compliant (no deprecated statements).
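
The Load Annotation changes above (only 'exon' features are loaded, quotes are stripped from attributes, and exons are mapped to their parent gene) can be sketched on a single combined GFF file. The sketch assumes that each exon's Parent attribute names the gene ID directly, which may not match the files SyMAP actually reads; `parse_attrs`, `exons_by_gene` and the sample lines are hypothetical.

```python
from collections import defaultdict


def parse_attrs(column9):
    """Parse 'key=value;...' pairs and strip double quotes from values."""
    attrs = {}
    for field in column9.strip().split(";"):
        key, sep, value = field.partition("=")
        if sep:
            attrs[key.strip()] = value.strip().replace('"', "")
    return attrs


def exons_by_gene(gff_lines):
    """Map each gene ID to the sorted (start, end) list of its exons."""
    genes = {}
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attrs(cols[8])
        if ftype == "gene":
            genes[attrs.get("ID")] = (start, end)
        elif ftype == "exon":  # CDS lines are not treated as exons
            exons[attrs.get("Parent")].append((start, end))
    return {gene_id: sorted(exons.get(gene_id, [])) for gene_id in genes}


# Hypothetical combined gene/exon file.
gff = [
    "##gff-version 3",
    'Chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=g1;Name="alpha"',
    "Chr1\tdemo\texon\t100\t300\t.\t+\t.\tParent=g1",
    "Chr1\tdemo\tCDS\t150\t300\t.\t+\t0\tParent=g1",
    "Chr1\tdemo\texon\t600\t900\t.\t+\t.\tParent=g1",
    "Chr1\tdemo\tgene\t2000\t2500\t.\t-\t.\tID=g2",
]
print(exons_by_gene(gff))
```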

Release: v5.1.1 (31July22)

These changes are to 'tidy up' things; nothing major.
  • Scripts:
    • The symap and viewSymap scripts have been changed from Perl to shell.
    • The viewSymap3D script has been removed, but the 3D documentation explains how you could recreate the 3D view.
  • Data directory
    • The data/fpc has been tarred and zipped (fpc.tar.gz) since it is rarely used anymore. If the data/fpc directory does not exist, then the BLAT args are not shown on the "Param" window. The FPC-Seq alignments still work.
    • If the data directory does not exist, it is created: (1) on "Add Project", (2) when redoing an alignment in the database. (The directory will exist if using symap_5.tar.gz, but it will not exist if starting from the code base).
  • Slight display change
    • The Summary shows the following: (1) for a project, if it used mask_genes or a different min_size than 100000, (2) for an alignment, if the "mummer_path" is set, or "concat" is unset, or the promer/nucmer parameters are changed. This is only created or updated on running the synteny.
  • Small changes
    • symap -v will display important MySQL variables and make suggestions.
    • It does not try to load hidden files.
    • It does not allow "cancel" on remove operations.
    • It displays length of log files (they are appended to, so can become large).
    • The project parameters annot_keywords and annot_kw_mincount are now being saved to the database so their values are shown correctly in the project parameters window on re-display.
    • A proper message is given when viewSymap tries to read a database (symap.config) that does not exist.
    • A stacktrace will not be printed to the terminal, only to the error.log (the e.getMessage() sometimes includes a partial stack).
    • Developers: removed more dead code.
Note: this was first released on 27July22, then again on 31July22 with a slight change.

Release: v5.1.0 (12June22)

The documentation has been moved from www.agcol.arizona.edu/software/symap to csoderlund.github.io/SyMAP. Any references to the online documentation within the code have been updated. For pre-v5.1 releases, the www.agcol.arizona.edu/software/symap/docs still exists.

The html Help files can now have URL links in them.

Release: v5.0.8 (1June22)

  1. A new option "Concat" has been added; when this is unchecked, it will not create a concatenation of the first genome to run against all chromosomes of the second; instead, it performs a chromosome-by-chromosome search.
  2. MUMmer4 can be used by compiling it on your machine from the /ext directory and adding its path to the "symap.config" file. See Running MUMmer.
  3. When computing alignment&synteny, all stacktrace errors are now written into error.log.
  4. The database name is prefixed to the "logs/LOAD.log" file name.
  5. A new document has been created for when there are problems running MUMmer from SyMAP.
  6. ConvertNCBI script was writing the scaffold sequences to end of the last chromosome (this did not make the annotation or synteny wrong - just extra sequence).

Release: v5.0.7 (4Feb22)

  1. MUMmer has been updated since the last SyMAP release; the code in the /ext/mummer4 directory has been updated for the latest MUMmer. The documentation has been updated accordingly (see Running MUMmer).
  2. ConvertNCBI: (1) Include ncbi_dataset format. (2) Convert hex to character.
  3. Parameters: (1) The Load would not work if any numeric parameter was set to blank. (2) For Alignment&Synteny "Parameters", the Global button was removed as it did not work except under weird conditions.
  4. Help: the viewer Help pages link to the pages at www.agcol.arizona.edu/software/symap. The approach for Mac was becoming obsolete, so it was changed. For Linux, more supported browsers were added.
  5. Improve documentation.
  6. Developers: Removed all Applet code (was disabled previously, but code was in everything).

Release: v5.0.6 (2June20)

  1. Some changes were necessary for SyMAP to run with MySQL v8.
    • The schema needed a change, which will occur on first reading of database.
  2. SyMAP works on MacOS Catalina 10.15.
  3. SyMAP can be compiled with Java v14 (some changes were necessary for this).
  4. The Applet has been removed and will no longer be supported. (It is in earlier SyMAP versions, and can be obtained at github).
  5. The built-in MySQL database on Linux is no longer supported.
  6. The Alignment&Synteny has a couple of checks for 'order_against' before processing: (1) Is the order_against project the same as the other selected project? (2) Does the ordered directory already exist?

Release: v5.0.5 (2May20)

  1. Ordering algorithm: The main change is a rewrite of the ordering algorithm, which orders draft contigs against the completed sequence of a closely related genome. The previous algorithm would sometimes put many of the placed contigs incorrectly into 'chrUNK', though the ordering in the anchored.csv file was correct.
  2. SyMAP Manager:
    • Show #genes and #gaps in summary
    • Add Project: On Linux, the "Add" button would stay disabled, this has been fixed.
  3. Chromosome Explorer: Mouse over a hit and the block number will be shown along with the other information.
  4. SyMAP Query: The "Align Sequence" display now shows locations in a text box under the alignment. v5.0.4 introduced a bug where the row selection display was not right, which is fixed.
  5. Small changes:
    • Running 'symap -v' will check the MySQL variables needed to run at a reasonable speed, and will output to terminal any variables that need changing.
    • Block View: When viewing the blocks using VNC, the chromosome that the blocks were aligned to was not visible, which has been fixed.
    • All Pairs button: Is only enabled if there is anything that will be executed when selected.
    • Load Project: If a file of 100's of draft sequences was loaded, this was flagged as an error (but still loaded); it is no longer called an error, though a message is written to the terminal.

Release: v5.0.4 (18Apr20)

  1. SyMAP Query has had major changes:
    1. It is faster, e.g. finding the hits in synteny blocks for 5 plant genomes previously took 9m:32s and now takes 0m:09s.
    2. The query page has changed to allow more versatile queries.
    3. The PgeneF is only computed if the "Compute PgeneF" option is checked; otherwise, there are no changes to this function or its options.
    4. Some of many minor changes:
      • Selecting a chromosome for a species (or multiples) now only shows hits that involve the selected chromosomes.
      • The #RGN column has been removed and a new "Has Gene" column has been added. Note, all species with a gene overlapping the hit should have 'annotation', but this makes it easier to see at a glance. Plus, if some genes do not have an ID or some other keyword, it can be confusing looking at the annotation.
      • The "Save for Reload" and "Reload Results" have been removed; queries are now so much faster that these no longer seem necessary.
      • Both the Overview and Table Summary have small changes.
  2. scripts/ConvertNCBI has been updated to also create a gap file.
  3. scripts/symap.sql has been removed and the schema has been incorporated into the Java code.
  4. SyMAP Summary: Displays major parameters for projects that are not loaded.
  5. Chromosome Explorer 2D View: The mouse over provides location information in the "Instruction" box, plus:
    • The following functions can be applied to the selected region of a sequence track: "Zoom All Tracks", "Zoom Selected Tracks", "Align (Max 30000)", "Show Sequence", where the first three are not new, but the labels changed for clarity.
    • The last one, "Show Sequence", has been added as there was no way to see the underlying sequence; it is simply a popup of the selected sequence.
    • The "Align" view has been slightly changed to add commas in large numbers and the length of the sequence.
  6. The symapApplet.jar was not working in recent releases, which has been fixed.

Release: v5.0.3 (11mar20)

  1. Chromosome Explorer 2D view. The "Annotation Description" can be right-clicked to view a pop-up of the description; the benefit is that sometimes part of the description is buried, and the pop-up can be copied.
  2. SyMAP Queries. (1) The filters set are shown at the top of the table. (2) The "Only Orphan Genes" option can now only be used with the "Annotation Description", "Location", and "Include" options.
  3. If access to MySQL does not work, a message is printed with the Java error message.

Release: v5.0.2 (24feb20)

  1. scripts/ConvertNCBI: (1) Produce a hard-masked genome sequence (input is soft-masked) with -m flag. (2) Includes the mRNA ID in the product attribute. (3) The product attribute has been improved.
  2. SyMAP Bug fix for using chromosome and scaffolds in the same project.
    The following did not work: Say the genomic sequence input was a mix of chromosomes (prefix 'Chr') and scaffolds (prefix 's'), and the Project parameters were prefix 'Chr' and minsize '10000', where the minsize filtered out annotated scaffolds, SyMAP would quit if there was annotation on filtered-out scaffolds; this has been fixed. Also, the documentation was not correct for mixing sequence types (e.g. prefix 'Chr' and 's'); the Project parameter 'grp_prefix' should be set to blank.
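
The fixed behavior can be illustrated with a small sketch. It is not SyMAP's implementation: the function names are hypothetical, and it assumes that a blank 'grp_prefix' accepts every sequence name and that 'min_size' keeps sequences of at least that length. The point it shows is that annotation on a filtered-out sequence is set aside rather than treated as fatal.

```python
from typing import Dict, List, Tuple

# (sequence name, start, end) for one annotation record; illustrative only.
Annotation = Tuple[str, int, int]


def select_sequences(lengths: Dict[str, int], grp_prefix: str,
                     min_size: int) -> List[str]:
    """Keep names that start with grp_prefix (blank matches all) and are long enough."""
    return [name for name, length in lengths.items()
            if name.startswith(grp_prefix) and length >= min_size]


def split_annotation(records: List[Annotation],
                     kept: List[str]) -> Tuple[List[Annotation], List[Annotation]]:
    """Separate records on kept sequences from records on filtered-out ones."""
    kept_names = set(kept)
    loaded = [r for r in records if r[0] in kept_names]
    skipped = [r for r in records if r[0] not in kept_names]
    return loaded, skipped
```

For a mixed input such as 'Chr1', 's1' and a short 's2', a blank prefix with a minimum size of 10000 keeps the first two, and any annotation on 's2' ends up in the skipped list.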

Release: v5.0.1 (16feb20)

  • Load Annotation
    • In the Params interface for a project, the user can set what keywords from the column 9 "Attributes" field of the gff file to load; this was not working correctly and has now been fixed.
    • The type (column 3) of mRNA was being loaded as a gene, which caused duplicate records; now mRNAs are ignored.
  • Alignment & Synteny
    • Though it was not possible to make MUMmer v4 part of SyMAP, it is possible to install MUMmer v4 in the /ext/mummer directory and it will work. Instructions are provided in System Guide.
    • The nucmer "--maxmatch" parameter was automatically used for self-synteny when a sequence file was being compared with itself; this would result in an extremely long execution time for large files (e.g. chromosomes). Consequently, this parameter has been removed, but can now be set by the user in the "Parameter" interface.
    • Pair Parameters: (1) The user can set parameters in "Self Args", which are only used when a chromosome sequence is being compared to itself. (2) This interface was very confusing as to when parameters were saved and what would be used; this should be clearer now, and is explained in the "Help" for the Parameter interface.
    • If only part of the alignments finished and the run ended prematurely, when run again, it will not redo any finished alignments.
  • Internally, gobs of commented-out code have been removed.

Release v5.0 (22jan20)

New projects cannot be added to v4.2 databases, but they can be queried (e.g. dotplot). The significant change is with the file structure for the data, where the old file structure will not work with this new release as shown further under Directory structure.
  • General
    • Error messages (and stack traces) are written to error.log
    • There was one symap script with various flags; now there are four symap scripts with fewer flags, as follows:
      Script         Build   View   Has 3D view
      ./symap        Yes     Yes    No
      ./symap3D      Yes     Yes    Yes
      ./viewSymap    No      Yes    No
      ./viewSymap    No      Yes    Yes
    • The symap -c command line parameter can be used to specify a configuration file other than 'symap.config'
    • The Summary list of all selected projects has been enhanced.
  • Query database
    • The "Display for Selected Pair:" popup Summary is now available for self-synteny and FPC-Seq
  • Build database
    • Log files: The log files are no longer in the results directories, instead they are in the top level /logs directory, as follows:
      1. The output from the loads are concatenated to the file /logs/LOAD.log.
      2. The output during alignment is in the /logs/<proj1-to-proj2> directory.
      3. The output to the terminal from the alignment and synteny computation is written in the /logs/<proj1-to-proj2>symap.log file.
  • Load project
    • The input files can be gzipped (end with .gz)
    • The Parameters menu uses the previous location for the file/directory finder.
    • Add ConvertEnsembl.java to convert an Ensembl file to SyMAP annotation file.
  • Alignment & Synteny
    • The trace output to terminal is much more informative.
    • The alignments for self-synteny have been optimized, which also changes the results (e.g. demo_seq now has some tiny blocks where before it had none).
    • The number of CPUs can be set in the interface, or either of the following:
      1. Use the -p command line parameter, e.g. ./symap -p 12.
      2. Set the 'nCPUs' configuration parameter in the symap.config file, e.g. nCPUs=12.
    • Much of the code for building the syntenies has been re-written, but the results are basically the same with a few tiny differences.
  • Directory structure: The following are the directory changes, where they are under /data.
    New                                  Old                                     Description
    /seq                                 /pseudo                                 Directory for sequence projects
    /seq/<project>/sequence              /pseudo/<project>/sequence/pseudo       Default directory for sequences
    /fpc_results                         /fpc_pseudo                             Directory of FPC to sequence project results
    /seq_results                         /seq_pseudo                             Directory of two sequence project results
    /seq_results/<proj1-to-proj2>/align  /pseudo_pseudo/<proj1-to-proj2>/anchor  Directory of alignments


Release: Jan2018 (build 120)

Release date: 1/16/18 - updated ConvertNCBI.java script.

Release date: 1/10/18

  • A few more improvements on the applet code.
  • Renamed some functions to make them clearer and updated the help.
  • Improved the Help on the parameters page.
  • Fixed a rare memory bug (occurred on a self-synteny; requires the database to be re-created).
  • The Image button does not work for Java v9, so a popup tells the user.
  • Added a Java script called ConvertNCBI.java, as documented in NCBI to SyMAP.
Release date: 12/28/17 (build 119)
  • The applet code has been fixed to work better.
  • The FPC to sequence block view had a bug that was fixed (though 'reverse' does not work).
Release date: 12/24/17 (build 118)
  • There was a limit of 75 sequenced regions per genome, beyond which the block view did not work.
    The limit has been changed to 150 and a popup message occurs if that limit is passed.
    A script (scripts/lenFasta.pl) is supplied to provide the sorted lengths; you should set the min_size parameter to the 150th length.
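
The idea behind choosing min_size from the sorted lengths can be sketched as follows. This is not the supplied lenFasta.pl script, only an illustration with hypothetical function names; returning 0 when there are fewer sequences than the limit is an assumption made here.

```python
from typing import Iterable, List, Optional


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence lengths found in FASTA lines, longest first."""
    lengths: List[int] = []
    current: Optional[int] = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            # A header closes the previous record and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return sorted(lengths, reverse=True)


def min_size_for_limit(sorted_lengths: List[int], limit: int = 150) -> int:
    """Length of the limit-th longest sequence; 0 if no filtering is needed."""
    if len(sorted_lengths) < limit:
        return 0
    # Sequences tied with this length could still push the count past the limit.
    return sorted_lengths[limit - 1]
```

To use it on a file, pass an open file handle to `fasta_lengths` and give the result to `min_size_for_limit`.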
Release date: 11/7/17 (build 117)
  • Fixed problem that prevented symap from running with new JDK v9.
Release date: 1/7/16 (build 116)
  • Fix display sizing errors in circle view display
  • Fix database error for long project names (max project name remains 40)

Version 4.2

Release date: 4/23/2014

New Features:

  • New, unified web access applet interface. Now you can present a unified SyMAP web display of some or all of the projects in your database using just one small HTML page (no CGI). Applet interfaces are also available for each of the sub-functions (dotplot, circle view, etc.) enabling customized web displays.

Modified Features:

  • The Explorer has been updated to be more usable for draft projects, i.e. when there are many sequences that have not been ordered and anchored to pseudomolecules. Specifically, it will now show up to 450 sequences, and to decrease clutter it shows only sequences that actually have a synteny block.
  • The prior web system files have been removed, including the cgi and supporting perl modules.
  • The circle view now draws all text labels horizontally, due to rendering problems with rotated text on several platforms (most severely on Mac). The rotated text is still available through a checkbox option. It is no longer necessary to run the perl install script to install web functionality.

Version 4.1

Release date: 2/20/2014

New Features:

  • Publication-quality image saving. A save button has been added to every graphical window allowing saving to a wide variety of formats, including scalable vector formats (svg, eps, pdf) which may be resized clearly to any size. (Many thanks to the developers of the FreeHEP Vector Graphics library, and especially Mark Donszelmann, for making this possible!)
  • Download of block co-ordinates. Choose the species of interest, open the Explorer, and the download button is at the lower left.

Modified Features:

  • Applet is signed with a DigiCert certificate, removing the security blocks which have been increasing in recent Java versions.
  • MUMmer binaries for Mac OSX have been replaced by 64-bit versions. The 32-bit versions which had been supplied could not access more than 2G RAM, preventing alignment of many chromosomes.

Version 4.0

Release date: 7/15/2012

New Features:

  • Project Manager Enhancements
    • Create new projects through the Manager
    • No need to create directories or copy files
    • Alignment & Synteny parameters also available
  • Query Page Enhancements
    • Compute putative gene families across species
    • Many additional filters
    • Create and view Muscle multiple-alignment for selected results
  • Additional Features
    • Script 3track_figure.pl makes 3-genome figure for publication (see online Tour; script added to 4.0 package post-release)

Modified Features:

  • Write contig anchoring information to a file (order_against mode)
  • No longer creates the "_ordered" project output from order_against
  • grp_prefix parameter now optional (but recommended). grp_sort eliminated.
  • Self-alignments improved by using -maxmatch option of MUMmer for individual chromosome self-alignments
  • Top-2 filtering modified to accept also hits within 80% of the 2nd best score, in order to retain gene-family hits and allow identification of families through the query interface.
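
One way to read the modified Top-2 rule is sketched below. It is an interpretation, not the SyMAP code: the function name is hypothetical, and "within 80% of the 2nd best score" is taken to mean a score of at least 0.8 times the second-best score.

```python
from typing import List


def top2_filter(scores: List[float], fraction: float = 0.8) -> List[float]:
    """Keep the two best scores plus any score >= fraction * second best."""
    ranked = sorted(scores, reverse=True)
    if len(ranked) <= 2:
        return ranked
    cutoff = fraction * ranked[1]
    # The first two always pass; later hits pass only if close to the second best.
    return [s for i, s in enumerate(ranked) if i < 2 or s >= cutoff]
```

Under this reading, extra members of a gene family with similar scores survive the filter, while clearly weaker hits are still dropped.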

Version 3.5

Release date: 1/24/12

New Features:

  • Dynamic circular display
    • Circular-style view allowing addition/removal of species and chromosomes
    • Replaces 3D view in the web applet (removing problems of MacOS support for 3D)
    • 3D still available in standalone mode
  • Annotation query interface (Java-based, upgrades the web CGI version)
  • Project parameters editing window
  • Unsigned web applet (no approval popups, if database and webserver have same host)
  • For draft sequence using order_against, create ordered and anchored versions of the project

Modified Features:

  • Added contig-flipping to draft sequence order_against function
  • Added draft sequence ordering demo
  • Improved command-line launch script

Version 3.4

Release date: 12/14/10

Modified Features:

  • Query files catenated before running MUMmer, for substantial increase in speed.
  • Batch alignment buttons added to Project Manager.
  • Improvements to handling of unordered scaffolds, including:
    • Scaffolds can all be in one fasta file.
    • Added min_size parameter, to specify minimum size scaffold to load.
    • Added order_against parameter, to specify reference sequence for ordering scaffolds (the order only affects the dotplot display)
  • Minor changes to CGI page displays to better accommodate unordered scaffolds.
  • Fix compilation error in 64-bit MUMmer binary, which prevented use of very long pseudomolecules.

Version 3.3

Release date: 10/4/10

Modified Features:

  • Added "no_overlapping_blocks" parameter
  • Added printout of block and synteny anchor tables to "/results" directory
  • Remove "Contig0" (singleton clones) from FPC displays
  • Adjustments to unannotated sequence clustering to obtain larger clusters, not dependent on the order of scanning of MUMmer output files.
  • When clicking a block in the dot plot, 2D view will open with hit filter setting "show only synteny hits"
  • Some bug-fixes to search.cgi

Version 3.2

Release date: 6/9/10

Modified Features:

  • Inversions are shown on the 3D view in green, where un-inverted are shown in red.
  • Anchor chains are not merged.
  • The dot plot background is white and the synteny blocks blue.
  • SyMAP automatically only uses one processor unless specified otherwise with the -p option.
  • The whole genome dot plot is available from the project manager.
  • When running an alignment, a summary is listed to the terminal and to a log file.
  • A lot of error checking has been added. However, every platform can have variations, which we may not have accounted for. So PLEASE, let us know if you have any problems and we will work with you to fix them.

Version 3.1

Release date: 2/12/10
This release includes major enhancements to the Dot Plot view.

New Features:
  • Multi-genome Dot Plot: multiple species can be displayed on the y-axis against a common reference species.
  • Dot Plot view integrated into 3D viewer, now the species/chromosomes displayed in the Dot Plot can be changed on-the-fly.
  • Dot Plot reference species on the x-axis can be changed dynamically via a new drop-down menu.
  • Main web page redesigned.


Version 3.0

Release date: 1/7/10
This major release includes many new features, in particular the 3D viewer and Project Manager.

New Features in User Interface:
  • Multi-genome 3D viewer.
  • Circular CGI view.
  • Gene search CGI.
  • Support for sequence-to-sequence alignment.
  • Option to flip sequence.
  • Zoom into sequence or hit regions using the mouse scroll wheel.
  • Three-track CGI blocks view.
  • Improved annotation display and addition of URL-embedding capability.
New Features in Back-End Processing:
  • Completely rewritten in Java with major performance improvement.
  • Addition of Project Manager GUI to simplify and automate back-end processing, sequence alignment, and synteny analysis.
  • Support for sequence-to-sequence alignment.
  • Hits are now clustered based on gene annotation and drawn as "ribbons" to indicate locations of sub-blocks.

Version 2.0

Release date: 5/18/07
This release includes many new features for the alignment view. The dot plot view was not changed.

New Features Block view (CGI):
  • View reverse option: reverse which of the two genomes is the reference genome.
  • Chromosome view: the blocks are color coded to indicate which chromosome they are from. Also, the number of anchors is displayed over each block.
New Features on Java interface:
  • Filters:
    • For a given track (Block, Hit, Sequence) hover the mouse over the track and right click. A menu of the most common filters is shown; this is much easier than opening up the Filter window.
    • From this window, you can go directly to the help for the given track. There is also a 'Navigation Help', as there are many ways to resize, scroll, etc.
  • Sequence track:
    • Show hit score bar is a new filter that shows the hits down the length of the sequence as a histogram showing how strong each hit is.
    • Show Hit Score Value is a new filter that shows the %identity. Note that the sequence graphic is twice as wide to show the score bar and score value.
    • The sequence track can be scrolled by moving the mouse wheel when the mouse is positioned over the sequence track. This retains the same zoom and filters, but moves up or down the length of the sequence (i.e. pseudomolecule).
    • The gene annotation now shows the intron/exon structure.
  • Hit track:
    • Show Only Gene Hits (contained) is a new filter that turns off the display of all hits unless they are contained in an annotated gene.
    • Show Only Gene Hits (overlap) is a new filter that turns off the display of all hits unless they overlap an annotated gene (but are not contained in a gene).
    • Show Only Non-Gene is a new filter that turns off the display of all hits unless they are NOT overlapping or contained in a gene.
    • Hit lines are highlighted in red when the mouse is positioned over a hit.
  • Block track:
    • When hovering over a contig, the contig information at the bottom of the display includes all the chromosomes that the contig hits.
  • Bug Fixes:
    • Genes drawn incorrectly when sequence range within gene boundary.
    • Fixed bug with pseudomolecule filter.
    • Certain queries were failing when used with MySQL 5.0.26 due to comma precedence change.
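
The three hit-track gene filters described above (contained, overlap, non-gene) amount to classifying each hit against the annotated gene intervals. The sketch below shows one such classification; the names are hypothetical and the coordinates are assumed to be inclusive, so it illustrates the categories rather than the viewer's actual code.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive, with start <= end


def classify_hit(hit: Interval, genes: List[Interval]) -> str:
    """Return 'contained', 'overlap', or 'non-gene' for a hit."""
    hit_start, hit_end = hit
    overlaps = False
    for gene_start, gene_end in genes:
        if gene_start <= hit_start and hit_end <= gene_end:
            # Entirely inside one gene: this category takes priority.
            return "contained"
        if hit_start <= gene_end and gene_start <= hit_end:
            overlaps = True
    return "overlap" if overlaps else "non-gene"
```

Each filter then shows only the hits whose category matches.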
New Features in Back-End Processing:
  • Support for mixed letter/number chromosome names and arbitrary ordering (see example in demo-fpc params file)
  • Pseudomolecule sequences loaded in 1Mb chunks, removing database problems with large text fields
  • Filtering optimized in anchors.pl to reduce memory use in loading large blat outputs
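
Storing a pseudomolecule in fixed-size chunks, and reading a region back from them, can be sketched like this. It is an illustration only: the names are hypothetical, "1Mb" is assumed to mean one million bases, and regions use 0-based, end-exclusive coordinates.

```python
from typing import Iterator, List

CHUNK_SIZE = 1_000_000  # assumption: "1Mb" taken as one million bases


def chunk_sequence(seq: str, size: int = CHUNK_SIZE) -> Iterator[str]:
    """Yield consecutive pieces of seq, each at most size characters long."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def fetch_region(chunks: List[str], start: int, end: int,
                 size: int = CHUNK_SIZE) -> str:
    """Rebuild seq[start:end] by reading only the chunks that cover it."""
    first = start // size
    last = (end - 1) // size
    joined = "".join(chunks[first:last + 1])
    offset = first * size
    return joined[start - offset:end - offset]
```

Each chunk stays small enough to store as an ordinary text field, and a region query only needs the few chunks that span it.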

Version 1.0

Release date: 9/1/06
First full public release.

Email Comments To: symap@agcol.arizona.edu