The sections that follow explain how to adapt the programs to different computers and compilers. The programs should compile without alteration on most versions of C. They use the "malloc" or "calloc" library functions to allocate memory, so the upper limits on how many species, sites, or characters they can handle are set by the system memory available to those memory-allocation functions.
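As a rough illustration of what allocation through "calloc" looks like, here is a minimal sketch. The names alloc_matrix and free_matrix are hypothetical and are not taken from the PHYLIP sources; the point is only that the table size is chosen at run time and the request fails cleanly when the system has no more memory to give.

```c
#include <stdlib.h>

/* Hypothetical helper (not a PHYLIP routine): allocate a zero-filled
   table with one row per species and one column per site.
   Returns NULL if the system cannot supply the memory. */
char **alloc_matrix(size_t species, size_t sites)
{
    size_t i;
    char **rows = (char **)calloc(species, sizeof(char *));

    if (rows == NULL)
        return NULL;
    for (i = 0; i < species; i++) {
        rows[i] = (char *)calloc(sites, sizeof(char));
        if (rows[i] == NULL) {      /* out of memory: release what we got */
            while (i > 0)
                free(rows[--i]);
            free(rows);
            return NULL;
        }
    }
    return rows;
}

/* Release a table obtained from alloc_matrix. */
void free_matrix(char **rows, size_t species)
{
    size_t i;

    for (i = 0; i < species; i++)
        free(rows[i]);
    free(rows);
}
```

Because the sizes are ordinary run-time arguments, nothing has to be recompiled to handle a larger data set; the only ceiling is the point at which calloc returns NULL.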
In the document file for each program, I have supplied a small input example, and the output it produces, to help you check whether the programs are running properly.
Most of the programs read their data from a file called "infile" and write their output to a file called "outfile" and a tree file to a file "treefile". If "infile" does not exist the program will prompt you for its name.
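The fallback from "infile" to a prompt can be pictured with the following sketch. It is an assumption about the general shape of such a routine, not the code PHYLIP actually uses: the function name open_input and the wording of the prompt are invented for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a PHYLIP routine): try to open the default
   file name; while that fails, ask for another name on "answers".
   The name finally used is left in "name". Returns NULL if the
   answers run out before a readable file is found. */
FILE *open_input(const char *defname, FILE *answers,
                 char *name, size_t namelen)
{
    FILE *fp;

    if (namelen == 0)
        return NULL;
    strncpy(name, defname, namelen - 1);
    name[namelen - 1] = '\0';
    for (;;) {
        fp = fopen(name, "r");
        if (fp != NULL)
            return fp;
        /* Illustrative wording only */
        printf("Cannot find input file \"%s\"\n", name);
        printf("Please enter a new file name> ");
        fflush(stdout);
        if (fgets(name, (int)namelen, answers) == NULL)
            return NULL;
        name[strcspn(name, "\n")] = '\0';   /* drop the trailing newline */
    }
}
```

Called as open_input("infile", stdin, name, sizeof name), a routine like this opens "infile" silently when it exists and otherwise keeps asking until it is given a name it can open.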
If you cannot do this, you may be able to transfer the entire package, in
the form of self-extracting archives (which is one of the ways we distribute it
for microcomputers) to your system using a terminal program with file transfer
capabilities. Some users are sufficiently terrified of this prospect that they
prefer to mail us diskettes and wait for several weeks. But if your
institution has an Internet connection it is much faster to do it that way. If
you have a serial port to which a modem can be hooked, you can get a terminal
program and do the transfers yourself. For most microcomputer systems,
public-domain or shareware terminal programs are available, such as the
widely-distributed KERMIT and MODEM families of programs. Most university
computer centers have communications programs (KERMIT or XMODEM) to "talk" to
KERMIT, MODEM, or PC-TALK and transfer files to and from it.
Thus, if you cannot get from me a disk format readable by your machine,
you can:
Now we turn to particular C compilers and describe particular problems
that may be encountered.
To compile individual programs without using the makefile, you need to do
the following. For a non-graphics program use the following command (DOS> is
the PCDOS prompt, so you do not type it):
Compiling the programs
Many machines that have C compilers, particularly Unix systems, have a
utility called "make" available that considerably simplifies the process of
compiling these programs. I will first discuss how to compile these programs
with "make" and then, after a digression on how to move PHYLIP to a
microcomputer, discuss for different individual systems how to compile the
programs. As we shall see below, for some DOS and Macintosh compilers one
cannot simply use "make" and the standard Makefile.
Using "make"
If your machine has "make" you can place all the programs for the package,
together with the file "Makefile" and the header files "phylip.h", and
"drawgraphics.h", in one directory. The Makefile and header files are
constructed to detect, for many varieties of C, which it is dealing with, and
inform the programs accordingly so that they can (by using "#ifdef") adapt to
the idiosyncrasies of the compiler.
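The kind of "#ifdef" adaptation meant here can be sketched as follows. The macros tested (__TURBOC__, __WATCOMC__, _MSC_VER, __GNUC__) are ones those compilers predefine themselves; COMPILER_NAME, compiler_name, and report_compiler are hypothetical names chosen for this example and do not come from phylip.h.

```c
#include <stdio.h>

/* Pick a label according to macros the compiler predefines.
   COMPILER_NAME is a made-up name for this sketch. */
#if defined(__TURBOC__)
#define COMPILER_NAME "Turbo C"
#elif defined(__WATCOMC__)
#define COMPILER_NAME "Watcom C"
#elif defined(_MSC_VER)
#define COMPILER_NAME "Microsoft C"
#elif defined(__GNUC__)
#define COMPILER_NAME "GNU C"
#else
#define COMPILER_NAME "generic C"
#endif

/* Return the label selected at compile time. */
const char *compiler_name(void)
{
    return COMPILER_NAME;
}

/* Print the label, e.g. for a startup banner. */
void report_compiler(FILE *out)
{
    fprintf(out, "Compiled with %s\n", compiler_name());
}
```

The same test can guard anything that differs between compilers, such as which graphics header to include or how large an array may be.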
To compile all the programs just type: make all
To compile just one program, such as DNAML, type: make dnaml
After a time the compiler will finish compiling. The names of the
executables will be the same as the names of the C programs, but without the
".c" suffix. Thus dnaml.c compiles to make an executable called "dnaml". If
object modules ending in ".o" are found in the directory after compilation they
can be removed if you need space.
Getting PHYLIP onto your microcomputer
C is widely available on microcomputers, and in any case we also
distribute executable versions for PCDOS, 386 PCDOS, and Macintosh systems.
Your institution may have an Internet connection, and if so there is probably a
PCDOS system or a Macintosh somewhere connected directly to it. Using that
machine you could download the executables and put them directly onto a diskette
for transfer to your own machine. You can also get the source code,
documentation, and executables by sending me the appropriate number of
diskettes (see the general information at the start of this document).
If you cannot read the diskette formats that I can write, and if you
absolutely INSIST that I distribute the package in this format, please send me
the computer and thirteen diskettes. I will promptly write the diskettes and
return them (but of course I will keep your computer).
Microsoft Quick C and Microsoft C
These comments apply to Microsoft Quick C but may also work with Microsoft
C. A Makefile for Microsoft Quick C is included with the source code. It is
called "Makefile.qc". If you copy it and call the copy "Makefile" (making sure
to first save the generic Makefile that comes with this package under some name
such as Makefile.old), you should be able to use "make" as described above,
except that it is called "nmake". Note that the command you must use to
compile (for example) DNAPARS is "nmake dnapars.exe", not "nmake dnapars", as
the program that results is to be called "dnapars.exe" and the Quick C Makefile
is set up that way.
DOS> qcl /AH /F 4000 /FPi [source files]
If the program you are trying to compile is a 1-part source (for example,
neighbor only has one part, neighbor.c) you should replace "[source files]"
with "neighbor.c". So the command would be:
DOS> qcl /AH /F 4000 /FPi neighbor.c
If the program you are trying to compile is a 2-part source (for example, mix
has two parts, mix.c and mix2.c) you can replace [source files] with both of
the source files. Make sure that the first source file in the list has the
same name as the executable file you want. i.e. use mix.c mix2.c and not the
other way around. If you reorder them, the executable file will be called
"MIX2.EXE". For mix, the command would be:
DOS> qcl /AH /F 4000 /FPi mix.c mix2.c
To compile a graphics program (i.e. DRAWGRAM or DRAWTREE) under Quick C without
using the Makefile, use one of the following commands:
for DRAWGRAM:
DOS> qcl /AH /F 4000 /FPi drawgram.c drawgraphics.c graphics.lib [for drawgram]
for DRAWTREE:
DOS> qcl /AH /F 4000 /FPi drawtree.c drawgraphics.c graphics.lib [for drawtree]
Turbo C++ for PCDOS
The following instructions are for Turbo C++ but may also work for Turbo C and
for Borland C, perhaps with slight modifications. Under normal situations you
can use the makefile. The makefile for Turbo C++ is included in the package as
"Makefile.tc". Copy it and call the copy "Makefile" (it would be wise to first
rename the original "Makefile" to "Makefile.old"). Then to compile, say,
DNAPARS, just type:
make dnapars.exe
However, if for some reason you want to do it by hand, follow these steps:
For the non-graphical programs (all those other than DRAWGRAM and DRAWTREE):
To compile dnapars.c, type the following (DOS> is the PCDOS prompt):
DOS> tcc -mh dnapars.c
If the program is large enough to be split into two source files (for example, dnaml.c and dnaml2.c), you will need to name both of them on the command line.
Examples:
DOS> tcc -mh dnaml.c dnaml2.c
DOS> tcc -mh neighbor.c
If you would like to use the program under the TD debugger, you should add a "-v" flag as a compiler option:
DOS> tcc -mh -v restml.c restml2.c
For the graphical programs (DRAWGRAM and DRAWTREE):
First you need to build the "BGI" drivers. The BGI drivers are included with your Turbo C compiler, and should be in the "BGI" directory (a subdirectory of the main Turbo C directory). To build them you need to use the "bgiobj" program, also in the BGI directory. The current version of PHYLIP supports the EGA/VGA, CGA, and Hercules drivers. If you have modified the sources to take advantage of other drivers, you will have to include those as well.
To build the BGI drivers:
DOS> cd \tc\bgi [this should be replaced with whatever your turboc dir is]
DOS> BGIOBJ EGAVGA
DOS> BGIOBJ CGA
DOS> BGIOBJ HERC
This generates the files "EGAVGA.OBJ", "CGA.OBJ", and "HERC.OBJ" in the current directory. Copy these into your main source directory (assumed here to be \phylip):
DOS> CP EGAVGA.OBJ \phylip [replace this with your source directory]
DOS> CP CGA.OBJ \phylip
DOS> CP HERC.OBJ \phylip
To compile the program, cd back to your source directory. You want to compile each source file, plus a shared graphics file called "drawgraphics.c". You also want to link it to the newly created BGI object files and to the graphics library.
Examples:
DOS> tcc -mh drawgram.c drawgraphics.c herc.obj egavga.obj cga.obj graphics.lib
DOS> tcc -mh drawtree.c drawgraphics.c herc.obj egavga.obj cga.obj graphics.lib
(to compile drawgram and drawtree, respectively)
If you want to compile for the TD debugger, add the -v flag as above.
Watcom C/386 is a very flexible compiler which can generate executable
programs for many different environments. Following are instructions for using
Watcom C/386 to compile for DOS using the DOS/4GW DOS extender (included with
the Watcom distribution) and for Microsoft Windows.
DOS/4GW:
To compile a program under Watcom C/386 for the DOS/4GW DOS extender use
the following (the "DOS>" is the PCDOS prompt, not something you type):
For Windows:
To compile a program under Watcom C/386 for Windows use the following:
Once you have compiled the Windows program you are not quite ready to run the
program under Windows. The final step is to link it with the "windows
supervisor". To do this do the following:
CAVEATS:
1. Make sure when you use wbind that \watcom\binw is somewhere in
your path. If it is not, you may have to tell wbind explicitly where
the windows supervisor file is, as in the following example:
2. The draw programs (drawgram, drawtree) currently do not compile
under Windows. Compile them for DOS/4GW and run them in a DOS shell under
Windows.
In running the programs, you may sometimes want to put them in background
so you can proceed with other work. On systems with a windowing environment
they can be put in their own window, and commands like "nice" used to make them
have lower priority so that they do not interfere with interactive applications
in other windows. If there is no windowing environment, you will want to use
an ampersand ("&") after the command file name when invoking it to put the job
in the background. You will have to put all the responses to the interactive
menu of the program into a file and tell the background job to take its input
from that file.
For example: suppose you want to run DNAPARS in the background, taking its
input data from a file called "sequences.dat", putting its interactive output
into a file called "screenout", and using a file called "input" as the place to
store the interactive input. The file "input" need only contain two lines:
To run the program in background, you would simply give the command:
If you wanted to give the program lower priority, so that it would not
interfere with other work, and you have Berkeley Unix type job control
facilities in your Unix, you can use the "nice" command:
You may also want to explore putting the interactive output into the null
file "/dev/null" so as to not be bothered with it (but then you cannot look at
it to see why something went wrong). If you have problems with creating output
files that are too large, you may want to explore carefully the turning off of
options in the programs you run.
If you are doing several runs in one, as for example when you do a
bootstrap analysis using SEQBOOT, DNAPARS (say), and CONSENSE, you can use an
editor to create a "batch file" with these commands:
Some of the programs come in several pieces that have to be compiled and
linked together. For example, DNAML comes in two pieces, dnaml.c and dnaml2.c.
To compile them and link the resulting object files together into one
executable, use the commands:
Watcom C/386
Watcom C/386 is the compiler we use to create the 386 PCDOS and 386
Windows versions of the executables. It has a "make" capability called
"wmake". We have had problems using this so the instructions here are for
individually compiling programs without wmake.
DOS> wcl386 /l=dos4gw [source files]
If the program you are trying to compile is a 1-part source (for example,
neighbor only has one part, neighbor.c) you can replace [source files] with
"neighbor.c". So the command would be:
DOS> wcl386 /l=dos4gw neighbor.c
If the program you are trying to compile is a 2-part source (for example, mix
has two parts, mix.c and mix2.c) you can replace [source files] with both of
the source files. Make sure that the first source file in the list has the
same name as the executable file you want. i.e. use mix.c mix2.c and not the
other way around. If you reorder them, the executable file will be called
"MIX2.EXE". For mix, the command would be:
DOS> wcl386 /l=dos4gw mix.c mix2.c
The resultant executable file will take advantage of your system's extended
memory and will not be limited to using only the first 640K. However, it needs
the file "dos4gw.exe" in order to run. If you want to be able to use the
program generated, make sure that this program is somewhere in your path. (To
ensure this you can copy the program into the directory where the compiled
program resides). This "dos extender" is bundled with the Watcom C/386
compiler and is freely redistributable.
DOS> wcl386 /l=win386 /zw [source files]
Again, replace [source files] with either the complete program (e.g. neighbor.c)
or both parts of the program (e.g. mix.c mix2.c).
DOS> wbind [program] -n
i.e.:
DOS> wbind mix -n
This command will generate [programname].exe, an application that is
runnable under Windows.
DOS> wbind mix -n -s c:\watcom\binw\win386.ext
Replace c:\watcom\binw\win386.ext with the full path of win386.ext on your system.
Think C for Macintosh
For Symantec's Think C compiler (formerly called Lightspeed C) a "make"
utility is not available. Thus you cannot use the Makefile but must compile
the programs individually. Here are the steps you should follow to compile a
typical program.
Although this is more tedious than using a Makefile, Think C works very well
with the PHYLIP programs and is the compiler we use for creating the Macintosh
executables.
Unix
I have already mentioned that under Unix you can use the "make" command to
compile programs. This works on all Unix systems. To compile an individual
program like dnapars.c you can give the command "make dnapars" or alternatively
"cc dnapars.c -lm". When compiling programs that come in two parts, such as
dnaml.c and dnaml2.c, you will have to issue three commands, two compile
commands and one link command:
cc -c dnaml.c
cc -c dnaml2.c
cc dnaml.o dnaml2.o -lm -o dnaml
where the first two commands produce the object modules dnaml.o and dnaml2.o
and the third command links them together into an executable that is called
dnaml.
sequences.dat
Y
which is what you would have typed to run the program interactively: the first
line in response to the program's request for an input file name if it did not
find a file named "infile", and the second in response to the menu.
dnapars < input > screenout &
which runs the program with input responses coming from "input" and interactive
output being put into file "screenout". The usual output file and tree file
will also be created by this run (keep that in mind, because if you run any other
PHYLIP program from the same directory while this one is running in background
you may overwrite the output file from one program with that from the other!).
nice +10 dnapars < input > screenout &
which lowers the priority of the run. To also time the run and put the timing
at the end of "screenout", you can do this:
nice +10 ( time dnapars < input ) >& screenout &
which I will not attempt to explain.
seqboot < input1 > screenout
mv outfile infile
dnapars < input2 >> screenout
mv treefile infile
consense < input3 >> screenout
and then take the file (say "foofile") containing these commands and give it
execute permission by using the command "chmod +x foofile" followed by the
command "rehash". Then the job that foofile describes can be run as a single
job in background by giving the command "foofile &". Note that you must also
have the interactive input commands for SEQBOOT (including the random number
seed), DNAPARS, and CONSENSE in the separate files "input1", "input2", and
"input3". With Berkeley-style job control the "nice" command can be used
within the batch file "foofile" before each program name to reduce the priority
with which the programs run.
VMS VAX systems
On the VMS operating system with DEC VAX VMS C the programs will compile
without alteration, except that we have to add some extra routines because the
"%hd" format in printf and fprintf does not work. These extra routines are in
the file VAXFIX.C. The commands for compiling a typical program (DNAPARS) are:
$ DEFINE LNK$LIBRARY SYS$LIBRARY:VAXCRTL
$ CC DNAPARS.C
$ CC VAXFIX.C
$ LINK DNAPARS,VAXFIX
Once you use this "$ DEFINE" statement during a given interactive session, you
need not repeat it again as the symbol "LNK$LIBRARY" is thereafter properly
defined. The compilation process leaves a file DNAPARS.OBJ in your directory:
this can be discarded. The executable program is named DNAPARS.EXE. To run
the program one then uses the command:
$ R DNAPARS
The compiler defaults to the filenames "INFILE.", "OUTFILE.", and
"TREEFILE.". If the input file "INFILE." does not exist the program will
prompt you to type in its name. Note that some commands on VMS such as "TYPE
OUTFILE" will fail because the name of the file that it will attempt to type
out will be not "OUTFILE." but "OUTFILE.LIS". To get it to type the right file
you would have to instead issue the command "TYPE OUTFILE.".
$ DEFINE LNK$LIBRARY SYS$LIBRARY:VAXCRTL
$ CC DNAML.C
$ CC DNAML2.C
$ CC VAXFIX.C
$ LINK DNAML,DNAML2,VAXFIX
This will make an executable called DNAML.EXE plus three ".OBJ" files that can be
discarded. Note that when a LINK command is issued the name of the first file
(in this case DNAML) becomes the name of the ".EXE" file that is produced by
the linker.
To make it easier to compile all of the programs on VMS systems, we have supplied a command file, "compile.com" that will do this. If you install that file and issue the command "@compile" it will compile all of the programs. However it is recommended that you also know how to recompile individual programs so that they can be altered to your purposes.
The programs DRAWGRAM and DRAWTREE both use routines in drawgraphics.c. To compile (for example) DRAWGRAM, use:
$ DEFINE LNK$LIBRARY SYS$LIBRARY:VAXCRTL
$ CC DRAWGRAPHICS.C
$ CC DRAWGRAM.C
$ CC VAXFIX.C
$ LINK DRAWGRAM,DRAWGRAPHICS,VAXFIX
which will create a file called DRAWGRAM.EXE, plus three ".OBJ" files. When you run DRAWGRAM you must have a font file present in your directory, as well as the tree file. If they are not found under their default names the program will prompt you for these. When you are using the interactive previewing feature of DRAWGRAM (or DRAWTREE) on a Tektronix or DEC ReGIS compatible terminal, you will want before running the program to have issued the command:
$ SET TERM/NOWRAP/ESCAPE
so that you do not run into trouble from the VMS line length limit of 255 characters or the filtering of escape characters.
However, although the underlying algorithms of most programs, which treat
sites independently, should be amenable to vector processors, there are details
of the code which might best be changed. In particular within the innermost
loops of the programs there are often scalar quantities that are used for
temporary bookkeeping. These quantities, such as sum1, sum2, zz, z1, yy, y1,
aa, bb, cc, sum, and denom in procedure makenewv of DNAML (and similar
quantities in procedure nuview) are there to minimize the number of array
references. For vectorizing compilers such as the Cray compilers it will be
better to replace them by arrays so that processing can occur simultaneously.
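The change suggested here is often called scalar expansion. The sketch below shows the idea on a made-up loop; it is not the code of makenewv or nuview, and the function names and the arithmetic are hypothetical. In the first form the temporaries yy and zz are reused on every pass through the loop; in the second each is promoted to an array with one element per site, so every loop is a simple element-by-element operation.

```c
#include <stddef.h>

/* Scalar-temporary form: yy and zz are overwritten for each site. */
void update_scalar(const double *a, const double *b,
                   double *out, size_t n)
{
    size_t i;
    double yy, zz;

    for (i = 0; i < n; i++) {
        yy = a[i] * b[i];
        zz = a[i] + b[i];
        out[i] = yy / zz;
    }
}

/* Expanded form: the caller supplies work arrays yy and zz of
   length n, and each loop touches every site independently. */
void update_expanded(const double *a, const double *b,
                     double *yy, double *zz,
                     double *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        yy[i] = a[i] * b[i];
    for (i = 0; i < n; i++)
        zz[i] = a[i] + b[i];
    for (i = 0; i < n; i++)
        out[i] = yy[i] / zz[i];
}
```

The expanded form costs extra memory for the work arrays. Whether it actually runs faster depends on the compiler and the machine, so it is worth timing both forms before committing to the change.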
Because IBM is IBM, it tried to impose the EBCDIC character code on the world.
There are good arguments for and against EBCDIC; in any case, the ASCII (or
ISO) code is winning out. I have chosen to distribute PHYLIP in the ASCII
character code, as more likely to be readable on more machines. Some
characters in ASCII have no equivalent in EBCDIC and get arbitrarily changed
when my ASCII files are read into an EBCDIC machine. You may find some
characters which look strange when viewed on a 3270 terminal on a CMS system,
but we have found none that cause trouble for the compiler.
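If you need to know at run time which character code a compiled program is using, a check along the following lines may help. This is a minimal sketch with hypothetical function names, not something taken from the PHYLIP sources. It relies on two well-known facts: in ASCII the letter 'A' has the code 0x41, and the lower-case letters occupy 26 consecutive codes, which is not the case in EBCDIC.

```c
/* Returns 1 if the execution character set looks like ASCII. */
int charset_is_ascii(void)
{
    return 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30;
}

/* Returns 1 if 'a'..'z' are consecutive codes (true in ASCII,
   false in EBCDIC, where the alphabet has gaps). */
int letters_contiguous(void)
{
    return ('z' - 'a') == 25;
}
```

Code that computes something like ch - 'a' to index the alphabet is the usual place where the gaps in EBCDIC matter.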
Andrew Keeffe was asked to investigate how to compile the C version of
PHYLIP on our IBM 3090 system, and here is what he has found.
These are the procedures for compiling the phylip package in C on an IBM
mainframe.
These instructions were developed using IBM C/370 on an IBM 3090 running
VM/XA CMS 5.6 Service Level 201.
If you fetch PHYLIP directly as an ftp binary transfer, getting a
compressed tar archive file, as available from our machine, we do not know
whether there is an "uncompress" and a "tar" utility available on CMS to extract
the files from the archive and translate them from ASCII to EBCDIC. You should
ask your computer consultants about that. Alternatively, you could fetch the
files to a PCDOS or Unix machine, extract the archives there, and then move the
resulting text files for the source code and documentation to the CMS system.
If you do that, after establishing the connection between the IBM and the other
host, type will translate the text files properly.
CMS prefers the names of files to have a minimum of two parts, called the
filename (abbreviated fn) and the filetype (abbreviated ft), separated by a
space. We have chosen "data" as the filetype, so that "infile" becomes "infile
data", "outfile" becomes "outfile data" and so forth.
All commands that you give to the host are shown in UPPER CASE. You can
type them in upper or lower case; CMS does not care.
Before compiling, give these commands to CMS:
To compile a single program, such as dnapars.c:
Cray
A number of people (F. James Rohlf, Kent Fiala, Shan Duncan, and Ron
DeBry) succeeded in various ways in adapting the Pascal version of PHYLIP to
several models of Crays. Recently Cray has been adopting Unicos, a Unix clone,
as the operating system for its machines, and this means the Unix instructions
should work for compiling the programs on Crays.
IBM Mainframes running CMS
The following information applies not only to IBM mainframes, but to
IBM-compatible mainframes such as Amdahls, Fujitsus, Hitachis, and ICLs when they
run IBM operating systems or IBM-compatible operating systems. It does not
apply to IBM mainframes running AIX (IBM's version of Unix) as for those one
can simply use the Unix instructions above without modification.
SETUP C370
GLOBAL TXTLIB EDCBASE IBMLIB
It would make sense to put these commands in your profile exec until the
compiling and linking is complete.
CC DNAPARS
If there are no errors, the compiler will produce a file with the same filename
and a filetype of 'text', DNAPARS TEXT in this case. Now give these commands:
LOAD DNAPARS
GENMOD DNAPARS
The genmod command generates an executable module file (DNAPARS MODULE) which
may be invoked by typing its name on the command line. Use this procedure to
compile all of the phylip programs except dnaml, dnamlk, restml, drawgram, and
drawtree.
The source files for dnaml, dnamlk, and restml have been split into two parts. To compile one of these programs, give these commands:
CC DNAML
CC DNAML2
LOAD DNAML DNAML2
GENMOD DNAML
Proceed similarly for dnamlk and restml.
The draw programs, drawgram and drawtree, both depend on common code which is stored in drawgraphics.c and drawgraphics.h. These names will be truncated to DRAWGRAP C and DRAWGRAP H on the CMS system. The contents of the files are not affected.
Compile the drawgraphics code:
CC DRAWGRAP
Compile and link the draw programs:
CC DRAWGRAM
LOAD DRAWGRAM DRAWGRAP
GENMOD DRAWGRAM
CC DRAWTREE
LOAD DRAWTREE DRAWGRAP
GENMOD DRAWTREE
If you are having trouble getting the programs running on your machine, contact me. If I can't help, I can at least find out whether there is anyone else who has adapted them to the same machine and put you in touch with them.
Other Computer Systems
As you can see from the variety of different systems on which these
programs have been successfully run, there are no serious incompatibility
problems with most computer systems. PHYLIP in various past Pascal versions
has also been compiled on 8080 and Z80 CP/M systems, Apple II systems running
UCSD Pascal, a variety of minicomputer systems such as DEC PDP-11's and HP
1000's, CDC Cyber systems, and so on. We hope gradually to accumulate
experience on a wider variety of C compilers. If you succeed in compiling the
C version of PHYLIP on a different machine or a different compiler, I would
like to hear the details so that I can include the instructions in a future
version of this manual.