Donna Spiegelman, ScD

mediateP

Functions for calculating the point and interval estimates of the natural indirect effect (NIE), total effect (TE), and mediation proportion (MP), based on the product approach.
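As a rough illustration of the product approach, the sketch below assumes the simplest setting: a linear mediator model and a linear outcome model with no exposure-mediator interaction. The function name and argument names are hypothetical and are not part of the mediateP package; interval estimation is omitted.

```python
def product_mediation(alpha_x, beta_x, beta_m):
    """Point estimates under the product approach (illustrative only).

    alpha_x: exposure coefficient in the mediator model  M ~ X (+ covariates)
    beta_x:  exposure coefficient in the outcome model   Y ~ X + M (+ covariates)
    beta_m:  mediator coefficient in the same outcome model
    """
    nie = alpha_x * beta_m   # natural indirect effect: product of the two paths
    te = beta_x + nie        # total effect: direct effect plus indirect effect
    mp = nie / te            # mediation proportion
    return nie, te, mp


nie, te, mp = product_mediation(alpha_x=0.5, beta_x=0.3, beta_m=0.4)
print(f"NIE={nie:.3f}  TE={te:.3f}  MP={mp:.3f}")
```

The mediation proportion is only interpretable when the direct and indirect effects act in the same direction, so that TE is not close to zero.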

Faculty: Fan Li, PhD; Donna Spiegelman, ScD

Download: CRAN R / mediateP package

Platform: R

Reference: doi.org (mediateP)

%blinplus

The macro %blinplus corrects logistic regression coefficients, their standard errors, and odds ratios and 95% confidence intervals for a biologically meaningful difference specified by the user (the “weights”) for measurement error in one or more model covariates. Regression model parameters from Cox models (PROC PHREG) and linear regression models (PROC REG) can also be corrected. A validation study is required to empirically characterize the measurement error model. Options are given for main study/external validation study designs and for main study/internal validation study designs (Spiegelman, Carroll, Kipnis; 2001). Technical details are given in Rosner et al. (1989), Rosner et al. (1990), and Spiegelman et al. (1997).
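A minimal sketch of the idea for a single mismeasured covariate, assuming the usual regression calibration correction: the uncorrected coefficient is divided by the slope (lambda) from the validation-study regression of the true exposure on its surrogate, and the variance comes from the delta method. The function and argument names are hypothetical; this is not the macro's code, and the multivariate case is not shown.

```python
import math


def regression_calibration(beta_naive, se_naive, lam, se_lam, increment=1.0):
    """Correct one log odds ratio for measurement error (illustrative only).

    beta_naive, se_naive: uncorrected coefficient and its standard error
    lam, se_lam: validation-study slope of true exposure on surrogate, and its SE
    increment: the exposure difference for which the odds ratio is reported
    """
    beta_corr = beta_naive / lam
    # Delta-method variance of the ratio beta_naive / lam
    var_corr = se_naive**2 / lam**2 + beta_naive**2 * se_lam**2 / lam**4
    se_corr = math.sqrt(var_corr)
    odds_ratio = math.exp(increment * beta_corr)
    lower = math.exp(increment * (beta_corr - 1.96 * se_corr))
    upper = math.exp(increment * (beta_corr + 1.96 * se_corr))
    return beta_corr, se_corr, (odds_ratio, lower, upper)


beta, se, (or_, lo, hi) = regression_calibration(0.2, 0.05, lam=0.5, se_lam=0.04)
print(f"corrected beta={beta:.3f} (SE {se:.3f}), OR={or_:.2f} [{lo:.2f}, {hi:.2f}]")
```

Because lambda is typically below 1, the corrected estimate moves away from the null and its confidence interval widens.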

Faculty: Donna Spiegelman, ScD

Download: %blinplus package

Platform: SAS

Reference: doi.org (%blinplus)

betacomp.f

Software for implementing Spiegelman D, Rosner B. “Estimation and inference for binary data with covariate measurement error and misclassification for main study/validation study designs.” Submitted for publication, J American Statist Assoc, June, 1997.

Faculty: Donna Spiegelman, ScD

Download: betacomp.f package

Platform: Fortran

Reference: doi.org (betacomp.f)

goodwin.f77

Software for implementing Crouch EAC, Spiegelman D. “The evaluation of integrals of the form ∫ f(t) exp(−t²) dt: Application to logistic-normal models.” Journal of the American Statistical Association 1990; 85: 464-469.
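For orientation, integrals of this form can be approximated surprisingly well by the plain trapezoidal rule on an evenly spaced grid. The sketch below shows only that basic rule; it is not a translation of goodwin.f77, and the step size, number of terms, and function names are assumptions chosen for illustration.

```python
import math


def trapezoid_gauss_integral(f, h=0.5, n_terms=20):
    """Approximate the integral of f(t) * exp(-t^2) over the real line.

    Uses the trapezoidal rule on the grid t = k*h for k = -n_terms..n_terms.
    """
    total = 0.0
    for k in range(-n_terms, n_terms + 1):
        t = k * h
        total += f(t) * math.exp(-t * t)
    return h * total


def logistic_normal_mean(mu, sigma):
    """E[expit(mu + sigma*Z)] for standard normal Z, via the substitution z = sqrt(2)*t."""
    def f(t):
        return 1.0 / (1.0 + math.exp(-(mu + sigma * math.sqrt(2.0) * t)))
    return trapezoid_gauss_integral(f) / math.sqrt(math.pi)


print(trapezoid_gauss_integral(lambda t: 1.0))   # close to sqrt(pi)
print(logistic_normal_mean(mu=0.0, sigma=1.0))   # close to 0.5 by symmetry
```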

Faculty: Donna Spiegelman, ScD

Download: goodwin.f77 package

Platform: Fortran

Reference: doi.org (goodwin.f77)

Multsurr method

The SAS macro %multisurr described in this documentation performs regression calibration for multiple surrogates of a single exposure, as discussed in the paper by Weller et al. (submitted to Biostatistics, 2004). This type of data is often encountered in occupational studies, where the measurement of exposure can be quite complex and is characterized by numerous workplace factors, so that multiple surrogates often describe one exposure.

Faculty: Donna Spiegelman, ScD

Download: Multsurr method package

Platform: SAS; S-PLUS

Reference: doi.org (Multsurr method)

%relibpls8

The macro %relibpls calculates regression coefficients, their standard errors, odds ratios (when relevant), and 95% confidence intervals for a biologically meaningful difference specified by the user (the “increments”), all corrected for measurement error in one or more model covariates. Linear (proc reg), logistic (proc logistic), survival and conditional logistic (proc phreg), and mixed (proc mixed) models are implemented. A reliability study is required to empirically characterize the measurement error model. Details are given in Rosner et al. (1989), Rosner et al. (1990), and Rosner et al. (1992), including “real data” examples.

Faculty: Donna Spiegelman, ScD

Download: %relibpls8 package

Platform: SAS

Reference: doi.org (%relibpls8)

%rrc

The macro %rrc uses the risk set regression calibration (RRC) method to correct the point and interval estimates of the relative risk in the Cox proportional hazards regression model for bias due to measurement error in one or more baseline or time-varying exposures, including time-varying variables that are functions of the exposure history, such as the 12-month moving average exposure, cumulative average exposure, and cumulative total exposure. Both external and internal validation study designs are supported. Technical details are given in Liao et al. (2011) and Liao et al. (2018).

Faculty: Donna Spiegelman, ScD

Download: %rrc package

Platform: SAS

Reference: doi.org (%rrc)

ge.int.f

Software for implementing Foppa I, Spiegelman D. “Power and sample size calculations for case-control studies of gene-environment interactions with a polytomous exposure variable”. American Journal of Epidemiology, 1997; 146:596-604.

Faculty: Donna Spiegelman, ScD

Download: ge.int.f package

Platform: Fortran

Reference: doi.org (ge.int.f)

ge_trend_v2

The program ge_trend_v2 is designed to calculate the power and minimum required sample size for case-control studies testing hypotheses about gene-environment interactions with a polytomous exposure variable. The program extends the original program ge_trend by allowing the main effect odds ratios for gene and exposure to vary within a user-specified interval under the alternative hypothesis.

Faculty: Donna Spiegelman, ScD

Download: ge_trend_v2 package

Platform: Fortran

Reference: doi.org (ge_trend_v2)

holcroft.f77

A user-friendly Fortran program is available from the second author, which calculates the optimal sampling fractions for all designs considered and the efficiencies of these designs relative to the optimal hybrid design for any scenario of interest.

Faculty: Donna Spiegelman, ScD

Download: holcroft.f77 package

Platform: Fortran

Reference: doi.org (holcroft.f77)

OPTITXS.r

Calculating the minimum number of participants (N) for a fixed number of measurements (r), given pre-specified power; the minimum number of repeated measurements (r) for fixed N, power, and pre-specified study length or time between visits; power for a given (N,r); and the optimal (N,r) subject to power or cost constraints. Compound symmetry, damped exponential, and random slopes covariances are supported.

Faculty: Donna Spiegelman, ScD

Download: OPTITXS.r package

Platform: Fortran

Reference: doi.org (OPTITXS.r)

%contrasttest

The %contrastTest macro conducts a heterogeneity test for comparing the exposure-disease associations obtained from separate subtype-specific analyses based on cohort or nested case-control studies. Specifically, the user runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype and then tests the heterogeneity hypothesis using the outputs from the separate models; alternatively, the user takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis. In the subtype-specific analyses, the confounder-disease associations are allowed to differ among the subtypes.

Faculty: Donna Spiegelman, ScD

Download: %contrasttest package

Platform: SAS

Reference: doi.org (%contrasttest)

%meta subtype trend

The %subtype_trend macro tests whether the exposure-subtype association has a trend across ordinal cancer subtypes. The user runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype and then tests the heterogeneity hypothesis using the outputs from the separate models; alternatively, the user takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis. In the subtype-specific analyses, the confounder-disease associations are allowed to differ among the subtypes.

Faculty: Donna Spiegelman, ScD

Download: %meta subtype trend package

Platform: SAS

%subtype_MultipleMarker

A meta-regression method that can utilize existing statistical software for mixed model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by one marker while controlling for other markers, and to evaluate whether the difference in exposure-subtype association across subtype defined by one marker depends on any other markers.

Faculty: Donna Spiegelman, ScD

Download: %subtype_MultipleMarker package

Platform: SAS

Reference: doi.org (%subtype_MultipleMarker)

%subtype

The %subtype macro examines whether the effects of the exposure(s) vary by subtype of a disease. It can be applied to data from cohort studies, nested or matched case-control studies, unmatched case-control studies, and case-case studies.

Faculty: Donna Spiegelman, ScD

Download: %subtype package

Platform: SAS

Reference: nih.gov (%subtype)

%metaanal

The %METAANAL macro is a SAS version 9 macro that produces the DerSimonian-Laird estimators for random effects or fixed effects models in pooled analysis or meta-analysis. It can be used to pool results from two or three of the Channing cohorts and test for between-studies heterogeneity.
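The DerSimonian-Laird calculation itself is short. The sketch below is an independent illustration of the standard formulas, not the macro's code; it assumes the inputs are study-specific estimates (for example, log relative risks) with their variances, and all names are hypothetical.

```python
import math


def dersimonian_laird(estimates, variances):
    """Fixed-effect and DerSimonian-Laird random-effects pooling (illustrative only)."""
    k = len(estimates)
    w = [1.0 / v for v in variances]                  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q statistic for between-study heterogeneity (k - 1 df)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]    # random-effects weights
    random = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sw),
        "q": q,
        "df": k - 1,
        "tau2": tau2,
        "random": random,
        "random_se": math.sqrt(1.0 / sum(w_star)),
    }


result = dersimonian_laird([0.10, 0.35, 0.22], [0.02, 0.03, 0.015])
print({name: round(value, 4) for name, value in result.items()})
```

When the estimated between-study variance is zero, the random-effects result coincides with the fixed-effects result.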

Faculty: Donna Spiegelman, ScD

Download: %metaanal package

Platform: SAS

Reference: doi.org (%metaanal)

%metadose

The %metadose macro is a SAS macro for meta-analysis of linear and nonlinear dose-response relationships. It is used when research reports studying the same dose-response relationship have different exposure or treatment levels. It is a two-step macro. First, for each study, it uses either the Greenland method (AJE, 1992) or the Hamling method (SIM, 2008) to estimate the cell counts of the 2×2 table adjusted for confounding. It then estimates the asymptotic correlation between the adjusted log odds ratio estimates for each exposure level relative to the referent level, from which the covariance matrix of these study-specific estimates is obtained. This step yields, for each study, a single trend estimate and its variance across the different exposure or treatment levels. Second, a meta-analysis is performed across all studies using the single study-specific trend estimates, expressed in common units across studies. An option also exists to explore and graph non-linearity in the pooled results.

Faculty: Donna Spiegelman, ScD

Download: %metadose package

Platform: SAS

Reference: doi.org (%metadose)

tcs

The identification of heterogeneity in effects between studies is a key issue in meta-analyses of observational studies, since it is critical for determining whether it is appropriate to pool the individual results into one summary measure. The result of a hypothesis test is often used as the decision criterion. In this paper, the authors use a large simulation study patterned from the key features of five published epidemiologic meta-analyses to investigate the type I error and statistical power of five previously proposed asymptotic homogeneity tests, a parametric bootstrap version of each of the tests, and tau2-bootstrap, a test proposed by the authors. The results show that the asymptotic DerSimonian and Laird Q statistic and the bootstrap versions of the other tests give the correct type I error under the null hypothesis but that all of the tests considered have low statistical power, especially when the number of studies included in the meta-analysis is small (<20). From the point of view of validity, power, and computational ease, the Q statistic is clearly the best choice. The authors found that the performance of all of the tests considered did not depend appreciably upon the value of the pooled odds ratio, both for size and for power. Because tests for heterogeneity will often be underpowered, random effects models can be used routinely, and heterogeneity can be quantified by means of R(I), the proportion of the total variance of the pooled effect measure due to between-study variance, and CV(B), the between-study coefficient of variation.

Faculty: Donna Spiegelman, ScD

Download: tcs package

Platform: Fortran

Reference: doi.org (tcs)

%glmcurv9

The %GLMCURV9 macro uses SAS PROC GENMOD and restricted cubic splines to test whether there is a nonlinear relation between a continuous exposure and an outcome variable. The macro can automatically select spline variables for a model. It produces a publication-quality graph of the relationship.

Faculty: Donna Spiegelman, ScD

Download: %glmcurv9 package

Platform: SAS

%kmplot9

The %KMPLOT9 macro makes publication-quality Kaplan-Meier curves for a whole sample or for subgroups/strata. If there are subgroups/strata, it does the log-rank test.
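For readers unfamiliar with the estimator behind these curves, the following is a minimal sketch of the Kaplan-Meier product-limit calculation for one group. It is independent of the macro; the function name and data layout are assumptions, and the log-rank test and plotting are not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at each distinct event time.

    times:  follow-up time for each subject
    events: 1 if the outcome occurred at that time, 0 if censored
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```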

Faculty: Donna Spiegelman, ScD

Download: %kmplot9 package

Platform: SAS

%lefttrunc

The %LEFTTRUNC macro makes publication-ready Kaplan-Meier-type curves using left-truncated data for a whole sample or for subgroups/strata.

Faculty: Donna Spiegelman, ScD

Download: %lefttrunc package

Platform: SAS

%lgtphcurv9

The %LGTPHCURV9 macro fits restricted cubic splines to unconditional logistic, pooled logistic, conditional logistic, and proportional hazards regression models to examine non-parametrically the (possibly non-linear) relation between an exposure and the odds ratio (OR) or incidence rate ratio (IRR) of the outcome of interest. It allows for controlling for covariates. It also allows stepwise selection among spline variables. The output is the set of p-values from the likelihood ratio tests for non-linearity, a linear relation, and any relation, as well as a graph of the OR, IRR, predicted cumulative incidence or prevalence, or the predicted incidence rate (IR), with or without its confidence band. The confidence band can be shown as the bounds of the confidence band, or as a “cloud” (gray area) around the OR/IRR/RR curve. In addition, the macro can display a smoothed histogram of the distribution of the exposure variable in the data being used.

Faculty: Donna Spiegelman, ScD

Download: %lgtphcurv9 package

Platform: SAS

Reference: doi.org (%lgtphcurv9)

%mediate

The %MEDIATE macro calculates the point and interval estimates, as well as a p-value, for the percent mediation of one effect by one or more intermediate variables. The macro is designed for treatment effects estimated as relative risks in Cox regression survival analysis using PROC PHREG and for treatment effects from generalized linear models using PROC GENMOD. When fitting log-binomial models with PROC GENMOD, an option is available to improve model convergence.

Faculty: Donna Spiegelman, ScD

Download: %mediate package

Platform: SAS

Reference: doi.org (%mediate)

%par

“What % of the cases would be prevented if it were possible to eliminate one or more risk factors from a target population?” The %PAR SAS macro is designed to answer questions such as this by estimating the population attributable risk (PAR) and its 95% confidence interval. We calculate the full PAR and partial PAR, as defined below. The variance formulas implemented here apply only to cohort studies. Currently, the confidence intervals are not valid for case-control studies. Please write to us if you have a case-control study. Models with interaction terms are acceptable. Population prevalences can be considered fixed (e.g. for sensitivity analysis), estimated from the same cohort from which the relative risks were estimated, or estimated from a population survey such as NHANES.

• FULL PAR: All measured risk factors are considered eliminated. All members of the target population who are exposed switch to the lowest risk category of all measured risk factors.

• PARTIAL PAR: One or more risk factors are considered eliminated, while others are allowed to remain unchanged.

References: Bruzzi et al. (1985); Spiegelman, Hertzmark, and Wand (2006).
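As a point-estimate illustration of the full PAR only, the sketch below assumes the commonly used form PAR = 1 − 1 / Σ p_s·RR_s, where p_s is the population prevalence of each joint risk-factor category and RR_s is its relative risk compared with the lowest-risk category. The function name and inputs are hypothetical; the partial PAR and the variance formulas used by the macro are not reproduced here.

```python
def full_par(prevalences, relative_risks):
    """Full population attributable risk from category prevalences and relative risks.

    prevalences:    population proportion in each category (must sum to 1)
    relative_risks: relative risk of each category versus the reference (reference = 1)
    """
    if abs(sum(prevalences) - 1.0) > 1e-8:
        raise ValueError("prevalences must sum to 1")
    # Average relative risk in the population, relative to the lowest-risk category
    mean_rr = sum(p * rr for p, rr in zip(prevalences, relative_risks))
    return 1.0 - 1.0 / mean_rr


# Half of the population exposed, with a threefold relative risk
print(full_par([0.5, 0.5], [1.0, 3.0]))
```

For a single binary risk factor this reduces to the familiar p(RR − 1) / (1 + p(RR − 1)).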

Faculty: Donna Spiegelman, ScD

Download: %par package

Platform: SAS

Reference: doi.org (%par)

%relrisk9

The %RELRISK9 macro obtains relative risk estimates using PROC GENMOD with the binomial distribution and the log link. This is particularly useful when the odds ratio is not a good approximation to the relative risk (e.g., because of high prevalence of the outcome or large relative risks).
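A small numerical illustration of why this matters: with a common outcome the odds ratio can sit well above the risk ratio, while with a rare outcome the two nearly agree. The 2×2 counts below are made up for illustration, and the function name is hypothetical.

```python
def risk_ratio_and_odds_ratio(a, b, c, d):
    """Crude risk ratio and odds ratio from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    return risk_ratio, odds_ratio


print(risk_ratio_and_odds_ratio(60, 40, 30, 70))   # common outcome
print(risk_ratio_and_odds_ratio(2, 98, 1, 99))     # rare outcome
```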

Faculty: Donna Spiegelman, ScD

Download: %relrisk9 package

Platform: SAS

Reference: doi.org (%relrisk9)

%robreg9

The %ROBREG9 macro is a SAS version 9 macro that runs robust linear regression models showing both the model-based (assuming normality) and empirical standard errors, for situations where it is reasonable to use PROC REG (i.e. no repeated measures, continuous dependent variable). This macro can also calculate point and interval estimates of effect on the (unitless) percent change scale, which is often more widely interpretable.

Faculty: Donna Spiegelman, ScD

Download: %robreg9 package

Platform: SAS

%table1

The %table1 macro computes indirectly standardized rates, means, or proportions. The results are automatically prepared, by level of a given exposure variable, in a formatted MS Word table. The table is intended for use in publications with minimal additional formatting or preparation. Table 1 of many papers is a breakdown of cohort characteristics by exposure category. In most instances, it is necessary to age-standardize the means or proportions of other potential confounders before displaying them by exposure category.

Faculty: Donna Spiegelman, ScD

Download: %table1 package

Platform: SAS

%yoll

The SAS YOLL macro uses PROC PHREG to compute the time from a specific start time (or age) to an outcome (expected time after the start time to the outcome) or the time from the outcome to a specific time (or age) (expected time lost before the end time). It bootstraps to get the confidence bounds on these values. It computes these values for different levels of an exposure at specified values of the covariates.

Faculty: Donna Spiegelman, ScD

Download: %yoll package

Platform: SAS

%icc9

The %ICC9 macro is a SAS version 9 macro that computes reliability coefficients (intraclass correlation coefficients) and their 95% confidence intervals. These quantities can be calculated after first adjusting for fixed effects.
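To make the quantity concrete, here is a minimal sketch of a one-way random-effects intraclass correlation for balanced data, computed from the between-subject and within-subject mean squares. This is a textbook calculation shown under the assumption of equal numbers of replicates per subject; how the macro estimates the variance components, adjusts for fixed effects, or builds confidence intervals is not shown, and the names are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for balanced replicate data.

    groups: list of per-subject lists, each holding the same number of replicates
    """
    n = len(groups)          # number of subjects
    k = len(groups[0])       # replicates per subject
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


replicates = [[5.1, 5.3], [6.8, 7.0], [4.2, 4.1], [6.0, 6.3]]
print(round(icc_oneway(replicates), 3))
```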

Faculty: Donna Spiegelman, ScD

Download: %icc9 package

Platform: SAS

Reference: nih.gov (%icc9)

%makespl

The %MAKESPL macro is a SAS macro that makes restricted cubic spline variables for use in other procedures. It is incorporated in several of the macros that test for non-linearity, but it can also be used on its own to create spline variables for covariates (allowing better control for the covariates, usually while using fewer degrees of freedom).
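The sketch below illustrates one common construction of restricted cubic spline variables: truncated cubic terms combined so that the fitted curve is linear beyond the outer knots. The scaling by the squared knot range and the function name are assumptions made for illustration; the macro's own parameterization and knot placement may differ.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline variables for one value x.

    Returns [x, s_1, ..., s_(k-2)] for k knots: the linear term followed by
    k - 2 nonlinear terms that are zero below the first knot and linear
    above the last knot.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("at least three knots are needed")

    def pos_cube(u):
        return u ** 3 if u > 0 else 0.0

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2     # keeps the spline terms on the scale of x
    terms = [float(x)]
    for j in range(k - 2):
        value = (
            pos_cube(x - t[j])
            - pos_cube(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cube(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        terms.append(value / scale)
    return terms


print(rcs_basis(2.5, knots=[0, 1, 2, 3]))
```

Entering all returned variables in a regression model fits the spline; dropping the nonlinear terms leaves the ordinary linear model, which is what makes a test for non-linearity straightforward.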

Faculty: Donna Spiegelman, ScD

Download: %makespl package

Platform: SAS

Reference: doi.org (%makespl)

%int2way

The %INT2WAY macro is a SAS macro that constructs all the 2-way interactions among a set of variables. It also makes a global macro variable that lists the new variables.
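The same idea in a few lines, as an illustration only: build the product of every pair of variables and keep a list of the new variable names. The naming scheme (`a_x_b`) and the dictionary-of-columns data layout are assumptions and do not reflect the macro's conventions.

```python
from itertools import combinations


def two_way_interactions(data, variables):
    """Create product terms for every pair of the named variables.

    data:      dict mapping variable name -> list of values (equal lengths)
    variables: names of the variables to cross
    Returns (new_columns, new_names).
    """
    new_columns = {}
    for first, second in combinations(variables, 2):
        name = f"{first}_x_{second}"
        new_columns[name] = [u * v for u, v in zip(data[first], data[second])]
    return new_columns, list(new_columns)


data = {"age": [50, 60], "bmi": [22.0, 30.0], "smoker": [0, 1]}
columns, names = two_way_interactions(data, ["age", "bmi", "smoker"])
print(names)
```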

Faculty: Donna Spiegelman, ScD

Download: %int2way package

Platform: SAS

%pctl9

The %PCTL9 macro is intended to make any desired number of quantiles for a list of variables. It can also make quantile indicators and median-score trend variables. A subset of the data can be used to determine the quantile boundaries.
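A minimal sketch of these three outputs for one variable: quantile categories, indicator variables, and a median-score trend variable (each subject is assigned the median of the variable within their quantile category). The cut-point rule comes from Python's `statistics.quantiles` default and is an assumption; the macro's boundary rules, handling of ties, and variable names may differ.

```python
import statistics
from bisect import bisect_left


def quantile_variables(values, n_quantiles=4):
    """Quantile category, indicators, and median-score trend variable per value."""
    cuts = statistics.quantiles(values, n=n_quantiles)   # n_quantiles - 1 cut points
    # Category 0 is the lowest group; values equal to a cut point fall in the lower group
    categories = [bisect_left(cuts, v) for v in values]
    medians = {
        c: statistics.median(v for v, cat in zip(values, categories) if cat == c)
        for c in set(categories)
    }
    indicators = [[int(cat == c) for c in range(n_quantiles)] for cat in categories]
    trend = [medians[cat] for cat in categories]
    return categories, indicators, trend


cats, inds, trend = quantile_variables([1, 2, 3, 4, 5, 6, 7, 8])
print(cats)
print(trend)
```

The median-score variable can be entered as a single continuous term in a model to test for linear trend across the quantile categories.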

Faculty: Donna Spiegelman, ScD

Download: %pctl9 package

Platform: SAS

crt2power

Provides methods for powering cluster-randomized trials with two co-primary outcomes using five key design techniques. Includes functions for calculating required sample size and statistical power. For more details on methodology, see Li et al. (2020) <doi:10.1111/biom.13212>, Pocock et al. (1987) <doi:10.2307/2531989>, Vickerstaff et al. (2019) <doi:10.1186/s12874-019-0754-4>, and Yang et al. (2022) <doi:10.1111/biom.13692>.

Faculty: Donna Spiegelman, ScD

Download: CRAN R / crt2power package

Platform: R

Reference: doi.org (crt2power)