README
This is an updated version of a program that analyzes relative risk
using tree-based methods. This program is free to use and distribute
provided that (a) I bear no responsibility for any consequences of using
this program, although I welcome collaborations that use it for
data analysis; (b) users acknowledge the use of this RTREE
program and distinguish it from
any commercial programs; and (c) users are
encouraged to cite the following two references in their publications:
(i) H.P. Zhang and M. Bracken. (1995) Tree-based risk factor analysis
of preterm delivery and small-for-gestational-age birth. American
Journal of Epidemiology, 141:70--78.
(ii) H.P. Zhang, T. Holford, and M. Bracken. (1996) A tree-based
method in prospective studies. Statistics in Medicine, 15:37--50.
Thanks.
Heping Zhang (December 7, 2000)
Introduction to the files in this directory:
rtree*: These are the executables compiled on various systems.
Save the one for your system as rcart, type rcart at your command
line, and follow the instructions. Alternatively, give the data file
name as a parameter, for example, rcart d053095. Missing values
are assumed to be coded as either "." or "NA".
The code was compiled using GCC on unix systems, Borland
C++ 5.0 on Windows95, and cc on a DIGITAL Alpha 4/275 with
OSF/1 V3.2B.
The Solaris2.6 and Windows95 versions are the most recent.
If your system is different from these, please send an e-mail
to
heping.zhang@yale.edu
sample.dat: This is a sample data file. The first row gives the
type of each column. "d" means the covariate is deleted from the
analysis; "o" means a continuous or ordinal covariate; "n"
means a nominal covariate; "r" means the outcome.
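The type row described above can be read with a few lines of Python. This is only a sketch, not part of RTREE: the names read_type_row and parse_record are hypothetical, the fields are assumed to be separated by whitespace (the delimiter is not stated here), and the record in the example is made up for illustration.

```python
# Hypothetical helpers, not part of RTREE.
# Assumption: fields are whitespace-separated.
ROLES = {"d": "deleted", "o": "ordinal", "n": "nominal", "r": "outcome"}
MISSING = {".", "NA"}  # missing-value codes named in this README


def read_type_row(line):
    """Return the role of each column, in order, from the first row."""
    codes = line.split()
    unknown = [c for c in codes if c not in ROLES]
    if unknown:
        raise ValueError(f"unrecognized type codes: {unknown}")
    return [ROLES[c] for c in codes]


def parse_record(line, roles):
    """Return {column index: text or None}, skipping columns marked "d"."""
    fields = line.split()
    if len(fields) != len(roles):
        raise ValueError("field count does not match the type row")
    return {
        i: (None if field in MISSING else field)
        for i, (field, role) in enumerate(zip(fields, roles))
        if role != "deleted"
    }


# Made-up example: one deleted column, two covariates, one outcome.
roles = read_type_row("d o n r")
record = parse_record("17 2.5 NA 1", roles)
print(roles)
print(record)
```

Values are kept as text so that the caller can decide how to convert ordinal and nominal covariates.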
sample.inf: The output file. The file can be read as follows:
Column 1: node number. Node 1 is the root node.
Column 2: number of subjects in the node, e.g., 2418 subjects
in node 2.
Column 3: left daughter node, e.g. node 2 is the left daughter
node of node 1.
Column 4: right daughter node, e.g. node 3 is the right daughter
node of node 1.
Column 5: The splitting variable. For example, variable 1 splits
node 1.
Column 6: The splitting value for the splitting variable in
Column 5.
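The six columns above map naturally onto a small record type. The following Python sketch is an illustration only: Node and parse_inf_line are hypothetical names, the columns are assumed to be whitespace-separated with no header lines, and the subject count and splitting value in the example row are invented. How terminal nodes are written in the file is not described here, so the sketch does not handle them specially.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """One row of the output file (hypothetical type, not part of RTREE)."""
    node: int          # Column 1: node number; node 1 is the root
    n_subjects: int    # Column 2: number of subjects in the node
    left: int          # Column 3: left daughter node
    right: int         # Column 4: right daughter node
    split_var: int     # Column 5: splitting variable
    split_value: str   # Column 6: kept as text; its format is an unknown


def parse_inf_line(line):
    """Parse one row, assuming six whitespace-separated columns."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    node, n_subjects, left, right, split_var = (int(x) for x in fields[:5])
    return Node(node, n_subjects, left, right, split_var, fields[5])


# Invented row: node 1 split by variable 1 into daughters 2 and 3.
root = parse_inf_line("1 3000 2 3 1 4.5")
print(root.left, root.right, root.split_var)
```

Storing the rows in a dictionary keyed by node number would then allow the tree to be walked from node 1 through the left and right daughter fields.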
sample.ps: A PostScript file of the final tree that can be viewed or
printed.