1.  Download the compiled VP1 and LT sequence sets as text (.txt) files.  Additional sequences of interest can be pasted in using FASTA format.

 

2.  Go to http://phylogeny.lirmm.fr/

 

3.  Choose the “One Click” link.

 

4.  Give the analysis a memorable name.  Paste sequences into the text window.

 

5.  De-select the box for “Use the Gblocks program.”  In my experience, the Gblocks module is not useful for analyses of polyomaviruses.

 

6.  Provide your email address so you’ll be notified when the run is done.  Click submit.

7.  When the analysis is complete, click the “Tree in Newick format” link.  Save as a text file (in most web browsers this can be done by right-clicking inside the window).

 

8.  Download the FigTree software from http://tree.bio.ed.ac.uk/software/figtree/

 

9.  Open the Newick text file using the File -> Open menu within FigTree.  When prompted, change the name “label” to “bootstrap.”  Bootstrap values can then be selected in the pull-down menu under the FigTree tab “Node Labels.”

 

10.  Play with the buttons to find the tree style you like best.  Save the file with the FigTree-specific suffix “.tre.”

 

11.  Under FigTree’s File menu, select “Export PDF.”  Polish any kludgy-looking text (e.g., the “_” character in virus names) using Adobe Illustrator or other PDF editing software.  Alternatively, the tree can be exported as an SVG file and polished using software such as Inkscape.
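
Steps 1 and 4 assume that all sequences end up in a single FASTA-formatted text. For anyone without sequence analysis software, the following is a minimal Python sketch of how downloaded sets and extra sequences could be combined and sanity-checked before pasting. The function names (`parse_fasta`, `merge_fasta`, `to_fasta`) and the toy records are made up for illustration; they are not part of Phylogeny.fr, and the placeholder strings are not real VP1 or LT sequences.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) tuples."""
    records = []
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def merge_fasta(*texts):
    """Combine several FASTA texts, rejecting duplicate or empty records."""
    merged = []
    seen = set()
    for text in texts:
        for name, seq in parse_fasta(text):
            if name in seen:
                raise ValueError(f"duplicate sequence name: {name}")
            if not seq:
                raise ValueError(f"no sequence found for: {name}")
            seen.add(name)
            merged.append((name, seq))
    return merged


def to_fasta(records, width=60):
    """Write (name, sequence) tuples back out as FASTA text."""
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical toy inputs standing in for a downloaded set and an
    # extra sequence of interest.
    compiled = ">Virus_A\nMAPKRK\nSGVS\n>Virus_B\nMAPTKR\n"
    extra = ">Virus_C\nMGAALA\n"
    print(to_fasta(merge_fasta(compiled, extra)), end="")
```

In practice the input strings would be read from the downloaded .txt files, and the printed result is what would be pasted into the text window in step 4.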

 

Notes:

•Our lab uses MacVector software for maintaining and annotating sequence files.  We’re fortunate that the NCI has a site license that makes MacVector available to intramural investigators.  Individual MacVector licenses are expensive.  However, MacVector offers student pricing at attractive rates.  If sequence analysis software isn’t available, ordinary text editing programs can be used to manually manipulate the posted text files.

 

•For my own purposes I usually do quick-and-dirty alignments using either MUSCLE or ClustalW within MacVector, then use the alignments to construct a neighbor-joining tree.  These are old and outdated methods, but they nevertheless seem to produce trees that are very similar to the more sophisticated Phylogeny.fr outputs.

 

•Bootstrap values indicate how reproducibly each observed clade is recovered.  Values run from 0 to 1, and numbers closer to 1 indicate higher confidence.

 

•If there is some reason to believe a viral species represents the “ancestral” form of a virus, it is legitimate to “root” the tree on that species.  Since virologists can rarely justify this assumption, “midpoint rooting” (in Phylogeny.fr or under the Tree menu in FigTree) is the safest approach.  It’s fun to look at how things shift around when the tree is re-rooted (button near the top left of the FigTree window) on different species.  Just beware that phylogenetics experts may throw rotten eggs at you for doing it.

 

•I personally find the old-fashioned radial trees (right-hand button under FigTree “Layout”) the most intuitive display.  The only counter-intuitive aspect is that it doesn’t matter how close two viruses are in space; all that matters is how far you would travel if you were walking along the lines connecting the viruses.  But once you have that idea in mind, I think it’s much easier to see at a glance who’s related to whom using radial trees.
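
The “walking along the lines” idea in the last note can be made concrete: the distance between two viruses is the sum of the branch lengths on the path connecting them (often called the patristic distance). Below is a minimal Python sketch that computes it from a Newick string like the one saved in step 7. The helper names and the toy tree are assumptions made up for illustration, and the small parser handles only plain labels, branch lengths, and internal support labels, not quoted names or bracketed comments.

```python
import re
from itertools import combinations


def parse_newick(newick):
    """Parse a simple Newick string.

    Returns (parent, length, leaves): parent[node] is the parent's id
    (None for the root), length[node] is the branch length above the
    node, and leaves maps each leaf name to its node id.
    """
    parent, length, leaves = {}, {}, {}
    stack = []     # internal nodes that are still open
    closed = None  # internal node that was just closed by ")"
    for token in re.findall(r"[(),;]|[^(),;]+", newick):
        token = token.strip()
        if not token:
            continue
        if token == "(":
            node = len(parent)
            parent[node] = stack[-1] if stack else None
            length[node] = 0.0
            stack.append(node)
            closed = None
        elif token == ",":
            closed = None
        elif token == ")":
            closed = stack.pop()
        elif token == ";":
            break
        else:
            label, _, dist = token.partition(":")
            if closed is None:
                # A label right after "(" or "," names a new leaf.
                node = len(parent)
                parent[node] = stack[-1] if stack else None
                leaves[label.strip()] = node
            else:
                # A label right after ")" belongs to that internal node
                # (for example a support value); only its length is kept.
                node = closed
            length[node] = float(dist) if dist else 0.0
    return parent, length, leaves


def patristic_distance(tree, name_a, name_b):
    """Sum the branch lengths along the path between two leaves."""
    parent, length, leaves = tree
    # Distance from leaf A up to itself and each of its ancestors.
    dist_up = {}
    node, total = leaves[name_a], 0.0
    while node is not None:
        dist_up[node] = total
        total += length[node]
        node = parent[node]
    # Climb from leaf B until reaching a node shared with A's lineage.
    node, total = leaves[name_b], 0.0
    while node not in dist_up:
        total += length[node]
        node = parent[node]
    return total + dist_up[node]


if __name__ == "__main__":
    # Hypothetical toy tree; real input would be the saved Newick file.
    toy = "((Virus_A:0.1,Virus_B:0.2)0.95:0.3,(Virus_C:0.4,Virus_D:0.5)0.80:0.6);"
    tree = parse_newick(toy)
    for a, b in combinations(sorted(tree[2]), 2):
        print(f"{a} - {b}: {patristic_distance(tree, a, b):.2f}")
```

Two viruses drawn side by side in a radial layout can still have a large distance by this measure if the path between them runs back through the center of the tree.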

Last updated by Buck, Christopher (NIH/NCI) [E] on Apr 13, 2020