README

  This is an updated version of a program that analyzes relative risk
using tree-based methods. This program is free to use and distribute
provided that (a) I bear no responsibility for the consequences of
using this program, although I welcome collaborations that use it for
data analysis; (b) users acknowledge the use of this RTREE program
and distinguish it from any commercial program; and (c) users are
encouraged to cite the following two references in their publications:

(i) H.P. Zhang and M. Bracken. (1995) Tree-based risk factor analysis
of preterm delivery and small-for-gestational-age birth.  American
Journal of Epidemiology, 141:70--78.

(ii) H.P. Zhang, T. Holford, and M. Bracken. (1996) A tree-based
method in prospective studies. Statistics in Medicine, 15:37--50.

Thanks.

Heping Zhang (December 7, 2000)

Introduction to the files in this directory:

rtree*:  These are the executables compiled on various systems.
         Save the one for your system as rcart, type rcart at your
         command line, and follow the instructions. Alternatively, give
         the data file name as a parameter, for example, rcart d053095.
         Missing values are assumed to be coded as either "." or "NA".

         The code was compiled using GCC on Unix systems, Borland
         C++ 5.0 on Windows95, and cc on a DIGITAL Alpha 4/275 with
         OSF/1 V3.2B.

         The Solaris2.6 and Windows95 versions are the most recently
         updated.

         If your system differs from these, please send an e-mail
         to
                    heping.zhang@yale.edu

sample.dat: This is a sample data file. The first row indicates the
         type of each column: "d" means the covariate is deleted from
         the analysis; "o" means a continuous or an ordinal covariate;
         "n" means a nominal covariate; "r" marks the outcome.
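
As an illustration of the type row and the missing-value codes, here is
a minimal Python sketch that checks a data file before it is passed to
the program. It is not part of RTREE. The function name parse_data, the
whitespace-separated layout, and the lowercase type codes are
assumptions, and the example rows are made up rather than taken from
sample.dat.

```python
# Hypothetical helper, not part of RTREE.
# Assumption: fields are separated by whitespace and type codes are lowercase.
MISSING = {".", "NA"}  # missing-value codes named in this README
ROLES = {"d": "deleted", "o": "ordinal", "n": "nominal", "r": "outcome"}


def parse_data(text):
    """Return (column roles, rows), with missing codes mapped to None."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    codes = lines[0].split()
    unknown = [c for c in codes if c not in ROLES]
    if unknown:
        raise ValueError(f"unknown type codes: {unknown}")
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(codes):
            raise ValueError(f"expected {len(codes)} fields, got {len(fields)}")
        rows.append([None if f in MISSING else f for f in fields])
    return [ROLES[c] for c in codes], rows


# Made-up data for illustration only.
example = """\
r o n d
1 34.5 2 17
0 NA 1 22
1 28.0 . 9
"""
roles, rows = parse_data(example)
```

Values are kept as strings so that nothing is assumed about how the
program itself interprets ordinal or nominal levels.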

sample.inf:   The output file. The file can be read as follows:
         Column 1: node number. Node 1 is the root node.
         Column 2: number of subjects in the node, e.g., 2418 subjects
                   in node 2.
         Column 3: left daughter node, e.g. node 2 is the left daughter
                   node of node 1.
         Column 4: right daughter node, e.g. node 3 is the right daughter
                   node of node 1.
         Column 5: the splitting variable, e.g., variable 1 splits
                   node 1.
         Column 6: the splitting value corresponding to the splitting
                   variable.
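
The six-column layout above can be read back into a small tree
structure. The following Python sketch is one way to do it and is not
part of RTREE. It assumes the columns are separated by whitespace, that
lines without six fields can be skipped, and that a node is terminal
when neither of its daughter numbers appears as a row of its own; the
README does not say how terminal nodes are coded, so that rule is a
guess. The names Node, parse_inf, and terminal_nodes are hypothetical,
and the example rows are made up apart from the 2418 subjects in node 2
mentioned above.

```python
from collections import namedtuple

# One row of the output file, following the column order described above.
Node = namedtuple("Node", "node n left right split_var split_value")


def parse_inf(text):
    """Parse six-column rows into a dict keyed by node number."""
    tree = {}
    for ln in text.splitlines():
        fields = ln.split()
        if len(fields) != 6:
            continue  # assumption: anything else is a blank or header line
        try:
            node, n, left, right, var = (int(f) for f in fields[:5])
        except ValueError:
            continue  # skip lines whose first five fields are not integers
        # The splitting value is kept as text; its format is not assumed.
        tree[node] = Node(node, n, left, right, var, fields[5])
    return tree


def terminal_nodes(tree):
    """Assumption: terminal means neither daughter appears as a row."""
    return sorted(k for k, nd in tree.items()
                  if nd.left not in tree and nd.right not in tree)


# Made-up rows for illustration only.
example = """\
1 3861 2 3 1 12.5
2 2418 0 0 0 0
3 1443 0 0 0 0
"""
tree = parse_inf(example)
leaves = terminal_nodes(tree)
```

With the tree held as a dictionary, walking from node 1 through the left
and right fields reaches every node in the file.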

sample.ps:  A PostScript file of the final tree that can be viewed or
         printed.