FITCH -- Fitch-Margoliash and Least-Squares Distance Methods

version 3.5c

DESCRIPTION

This program carries out Fitch-Margoliash, Least Squares, and a number of similar methods as described in the documentation file for distance methods.

The options for FITCH are selected through the menu, which looks like this:

Fitch-Margoliash method version 3.5c

Settings for this run:
  U                 Search for best tree?  Yes
  P                                Power?  2.00000
  -      Negative branch lengths allowed?  No
  O                        Outgroup root?  No, use as outgroup species  1
  L         Lower-triangular data matrix?  No
  R         Upper-triangular data matrix?  No
  S                        Subreplicates?  No
  G                Global rearrangements?  No
  J     Randomize input order of species?  No. Use input order
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, VT52, ANSI)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

Most of the input options (U, P, -, O, L, R, S, J, and M) are as given in that file, and their input format is the same as given there. The U (User Tree) option has one additional feature when the N (Lengths) option is used. This menu option will appear only if the U (User Tree) option is selected. If N (Lengths) is set to "Yes" then if any branch in the user tree has a branch length, that branch will not have its length iterated. Thus you can prevent all branches from having their lengths changed by giving them all lengths in the user tree, or hold only one length unchanged by giving only that branch a length (such as, for example, 0.00). You may find program RETREE useful for adding and removing branch lengths from a tree. This option can also be used to compute the Average Percent Standard Deviation for a tree obtained from NEIGHBOR, for comparison with trees obtained by FITCH or KITSCH.

Another input option available in FITCH that is not available in KITSCH or NEIGHBOR is the G (Global) option. G is the Global search option. This causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program. It is not an option in KITSCH because it is the default and is always in force there. The O (Outgroup) option is described in the main documentation file of this package. The O option has no effect if the tree is a user-defined tree (if the U option is in effect). The U (User Tree) option requires an unrooted tree; that is, it require that the tree have a trifurcation at its base:

     ((A,B),C,(D,E));

The output consists of an unrooted tree and the lengths of the interior segments. The sum of squares is printed out, and if P = 2.0 Fitch and Margoliash's "average percent standard deviation" is also computed and printed out. This is the sum of squares, divided by N-2, and then square-rooted and then multiplied by 100 (n is the number of species on the tree):

                          1/2
     APSD = ( SSQ / (N-2) )    x 100.

where N is the total number of off-diagonal distance measurements that are in the (square) distance matrix. If the S (subreplication) option is in force it is instead the sum of the numbers of replicates in all the non-diagonal cells of the distance matrix. But if the L or R option is also in effect, so that the distance matrix read in is lower- or upper-triangular, then the sum of replicates is only over those cells actually read in. If S is not in force, the number of replicates in each cell is assumed to be 1, so that N is n(n-1), where n is the number of species. The APSD gives an indication of the average percentage error. The number of trees examined is also printed out.

The constants available for modification at the beginning of the program are: "smoothings", which gives the number of passes through the algorithm which adjusts the lengths of the segments of the tree so as to minimize the sum of squares, "namelength", which gives the length of a species name, and "epsilon", which defines a small quantity needed in some of the calculations. There is no feature saving multiply trees tied for best, partly because we do not expect exact ties except in cases where the branch lengths make the nature of the tie obvious, as when a branch is of zero length.

The algorithm can be slow. As the number of species rises, so does the number of distances from each species to the others. The speed of this algorithm will thus rise as the fourth power of the number of species, rather than as the third power as do most of the others. Hence it is expected to get very slow as the number of species is made larger.

TEST DATA SET


    5
Alpha      0.000 1.000 2.000 3.000 3.000
Beta       1.000 0.000 2.000 3.000 3.000
Gamma      2.000 2.000 0.000 3.000 3.000
Delta      3.000 3.000 3.000 0.000 1.000
Epsilon    3.000 3.000 3.000 1.000 0.000

OUTPUT FROM TEST DATA SET (with all numerical options on)

5 Populations Fitch-Margoliash method version 3.5c __ __ 2 (Obs - Exp) Sum of squares = /_ /_ ------------ 2 i j Obs Negative branch lengths not allowed Name Distances ---- --------- Alpha 0.00000 1.00000 2.00000 3.00000 3.00000 Beta 1.00000 0.00000 2.00000 3.00000 3.00000 Gamma 2.00000 2.00000 0.00000 3.00000 3.00000 Delta 3.00000 3.00000 3.00000 0.00000 1.00000 Epsilon 3.00000 3.00000 3.00000 1.00000 0.00000

+---------Beta ! ! +---------Epsilon ! +-----------------------------3 --1---------2 +---------Delta ! ! ! +-------------------Gamma ! +---------Alpha remember: this is an unrooted tree! Sum of squares = 0.00000 Average percent standard deviation = 0.00000 examined 15 trees Between And Length ------- --- ------ 1 Beta 0.50000 1 2 0.50000 2 3 1.50000 3 Epsilon 0.50000 3 Delta 0.50000 2 Gamma 1.00000 1 Alpha 0.50000

Back to the main PHYLIP page

Back to the SEQNET home page

Maintained 15 Jul 1996 -- by Martin Hilbers(e-mail:M.P.Hilbers@dl.ac.uk)

FITCH -- Fitch-Margoliash and Least-Squares Distance Methods

CONTENTS:

DESCRIPTION

TEST DATA SET

OUTPUT FROM TEST DATA SET (with all numerical options on)