# Designing randomized trials to understand treatment effect heterogeneity

November 17, 2020

## Information

Biostatistics Seminar - November 17, 2020

Elizabeth Tipton, PhD

Associate Professor, Department of Statistics, Northwestern University

ID5892

- 00:00So welcome everyone,
- 00:01it is my great pleasure to introduce
- 00:04our seminar speaker today,
- 00:06Doctor Elizabeth Tipton.
- 00:07She's an associate professor of
- 00:09statistics, the co-director of the
- 00:12Statistics for Evidence-Based Policy
- 00:14and Practice Center, and a faculty
- 00:16fellow in the Institute for Policy
- 00:18Research at Northwestern University.
- 00:20Dr. Tipton's research focuses on
- 00:23the design and analysis of randomized
- 00:25experiments with a focus on issues of
- 00:28external validity and generalizability,
- 00:30as well as meta-analysis with a
- 00:33focus on dependent effect sizes.
- 00:35Um, today she's going to share with us
- 00:37how to design randomized experiments
- 00:39to better understand treatment effect
- 00:41heterogeneity. Welcome, Beth.
- 00:43The floor is yours.
- 00:44Thank you.
- 00:45Thank you. I'm very excited to be here today.
- 00:48I really wish I wasn't talking
- 00:50from my office slash closet and was
- 00:52actually with you guys in person, and
- 00:55this is my first time doing slides where
- 00:58I'm on the slide, so it's a little...
- 01:00It's a little strange.
- 01:01I don't know what is the protocol
- 01:04for questions. How do you guys?
- 01:06How do you usually set this up?
- 01:08Do people... what's the norm?
- 01:10Do you guys usually jump in
- 01:12with questions or save them for
- 01:13the end? So I think as you prefer,
- 01:16we can do either way. OK, I'm
- 01:18just... I won't be very good at
- 01:20checking the chat, so if there's
- 01:22a question, if somebody can just
- 01:23speak up, that would be... Will do that.
- 01:26I'll do that on the chat. OK, thank you.
- 01:28OK so I just want to set out background
- 01:31for what I'm talking about today,
- 01:33which is I'm talking about randomized
- 01:34trials, and I realize that in a
- 01:36biostatistics department, you guys...
- 01:39The idea that randomized trials are
- 01:41common is probably almost absurdly basic
- 01:43for the world that you operate in,
- 01:45but I do a lot of my statistical
- 01:47work in the areas of education and
- 01:50psychology and kind of in the field
- 01:52experiments world, and in those areas
- 01:55randomized trials have only
- 01:56become common really,
- 01:57I'd say, in the last 20 years.
- 02:00So almost 20 years ago,
- 02:03the Institute of Education Sciences
- 02:05was founded in the Department of
- 02:08Education in the US government,
- 02:10and that has funded almost 500
- 02:12what are called efficacy and
- 02:15effectiveness trials in education.
- 02:17Previous to that there were
- 02:19very few of these.
- 02:21There's also an increasing number of
- 02:24nudge experiments and social psychology
- 02:27experiments that are occurring in the world.
- 02:31I know that there's a lot of randomized
- 02:34trials occurring in developing countries,
- 02:38so this is, like, in parallel,
- 02:40maybe, to
- 02:42public health,
- 02:43there being randomized trials there,
- 02:45so I'm just sort of pointing out
- 02:46that these are becoming increasingly
- 02:48common for policy decisions,
- 02:50not just individual decisions.
- 02:53But the trials as they are
- 02:56designed currently are not necessarily
- 02:58ideal, in the sense that they are
- 03:00not as big as we would like them to be
- 03:03in order to be able to really explore
- 03:06the data well. They're often in,
- 03:08you know,
- 03:09sort of somewhat small samples
- 03:11of clusters in the kind of
- 03:13education world that I work in.
- 03:16They're very often just simple
- 03:18two-arm designs, 50/50 treatment and control.
- 03:19It's much less common to see things
- 03:22like stepped-wedge or SMART designs,
- 03:24so those are trickling in, I think.
- 03:28And the goal of these is often
- 03:30to get into something like a clearinghouse
- 03:32someplace so that policy
- 03:35decision makers can use the
- 03:38information from the trials to
- 03:40make decisions.
- 03:41But the problem, which is the focus
- 03:43of my talk, is that they're very
- 03:45often taking place in samples
- 03:48that are purely of convenience,
- 03:50which makes thinking about generalizability
- 03:52and heterogeneity rather difficult.
- 03:58If treatment effects vary, if treatment
- 04:00effects vary across individuals or
- 04:01they vary across clusters in some way,
- 04:04then it's pretty straightforward to
- 04:05see, as a group of statisticians here,
- 04:07that the average treatment effect
- 04:09you would get in the population
- 04:11is probably not exactly the same thing as
- 04:14the average treatment effect in the sample,
- 04:16and that these could be quite different
- 04:18if treatment effects vary a lot, and
- 04:21depending upon how the sample is selected.
- 04:23So there has been an increasing
- 04:25amount of work in this area.
- 04:27There's a couple of papers I think that
- 04:30are particularly helpful. There's
- 04:32a paper in education where they're
- 04:34looking at bias from
- 04:37non-random treatment assignment, and
- 04:39they show that the bias of external
- 04:41validity is on the same order as
- 04:44internal validity bias, and to do
- 04:46so they sort of leverage
- 04:49a natural experiment with a
- 04:51randomized trial to look at this.
- 04:52And that's work by Bell,
- 04:54Olsen, Orr, and Stuart.
- 04:55There's also work showing that,
- 04:58in education, the kinds of schools
- 05:00and school districts that take part in
- 05:03randomized trials are different than
- 05:05the various populations
- 05:08that something like the Institute of
- 05:10Education Sciences might be interested in.
- 05:13So I have a paper out with Jessaca
- 05:16Spybrook and our students looking
- 05:18at 37 randomized trials and the
- 05:20samples of schools taking part in
- 05:22those studies and comparing them
- 05:24to various populations of schools
- 05:27in the US.
- 05:28There's also work, hidden behind me,
- 05:31by Liz Stuart
- 05:33and colleagues looking at school districts
- 05:36and a couple of other papers as well,
- 05:40and these find fairly consistent things.
- 05:43For example,
- 05:44that large school districts are
- 05:47overrepresented in research
- 05:49relative to the size of districts in the US.
- 05:52There's been also a lot of work
- 05:55in this area of generalizability
- 05:57and post hoc corrections.
- 05:59I started into this work looking at
- 06:02using post stratification as a way
- 06:05of estimating a population average
- 06:07treatment effect from a sample.
- 06:09There's also been work using
- 06:11inverse probability weighting,
- 06:12maximum entropy weighting,
- 06:14bounding approaches.
- 06:15There have been some approaches
- 06:17that focus on little,
- 06:18so I'm thinking like...
- 06:22There's the paper by Stuart and Green.
- 06:24I think that does that,
- 06:25so there's been like a kind of
- 06:27a flurry of method development.
- 06:29I think here in this area of
- 06:31thinking you know,
- 06:32how do I actually estimate this?
- 06:34If I have population data of different forms
- 06:36and I have sample data of different forms,
- 06:39how can I actually estimate a
- 06:41population average treatment effect?
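As a rough illustration of one of the approaches mentioned above, post-stratification, the sketch below weights stratum-specific effect estimates by each stratum's share of the target population. The function name, the stratum labels, and the numbers are hypothetical, and this is a minimal sketch rather than the estimator from any particular paper. It also reports how much of the population the sample covers, since a stratum with no sampled units cannot be reweighted.

```python
def post_stratified_ate(stratum_effects, population_shares):
    """Weight stratum-level effect estimates by population shares.

    stratum_effects: {stratum: effect estimated in the trial sample}
    population_shares: {stratum: share of the target population}, summing to 1
    Returns (estimate over the covered strata, share of population covered).
    """
    # Only strata that appear in the sample can contribute to the estimate.
    covered = [s for s in population_shares if s in stratum_effects]
    coverage = sum(population_shares[s] for s in covered)
    if coverage == 0:
        raise ValueError("no population stratum is represented in the sample")
    weighted = sum(population_shares[s] * stratum_effects[s] for s in covered)
    # Renormalise so the weights over the covered strata sum to 1.
    return weighted / coverage, coverage


# Hypothetical example: rural schools are in the population but not in the trial.
effects = {"urban": 0.30, "suburban": 0.10}
shares = {"urban": 0.2, "suburban": 0.5, "rural": 0.3}
estimate, coverage = post_stratified_ate(effects, shares)
print(f"estimate for covered strata: {estimate:.3f}, coverage: {coverage:.0%}")
```

In this made-up example the estimate only speaks for 70% of the population; no reweighting can recover the missing stratum without further assumptions.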
- 06:43But when I first started doing this work,
- 06:46I realized in a series of examples
- 06:47that I was working on that the
- 06:50effectiveness of these methods is
- 06:51often severely limited in practice
- 06:53because of undercoverage and
- 06:55what I mean is that you can't...
- 06:57If it turns out that in your population
- 07:00there's a part of the population
- 07:02that's just not represented in the trial,
- 07:04there's really not much statistical
- 07:06magic you can do.
- 07:07You can make some assumptions,
- 07:09but you can't really reweight
- 07:11something that doesn't exist, and the...
- 07:14It's really a reflection of lack
- 07:16of positivity in the study. Yes,
- 07:18exactly, thanks. Yeah, exactly.
- 07:19And yeah, I'm just using survey
- 07:21sampling language for the same thing.
- 07:23That's right, yeah, and so...
- 07:25And the lack of positivity
- 07:27often arises because people aren't
- 07:29thinking about what the population is
- 07:31in advance, and so it's very tricky
- 07:33for them after the fact to generalize,
- 07:35because it turns out that maybe
- 07:36what I as the analyst am now
- 07:38trying to think of as the population
- 07:40isn't exactly the population,
- 07:42but it's very hard for people to
- 07:44articulate what the population is,
- 07:46and so I spent a lot of time just trying to
- 07:49figure out what the population actually is,
- 07:51and if that population is meaningful,
- 07:53if that's a population that even matters.
- 07:55So I realized I pivoted a bit.
- 07:57I realized that you could do
- 07:58a lot of this with statistics,
- 08:00but you were going to be limited
- 08:02if you didn't design better trials.
- 08:08And so that's led me to think,
- 08:10well, why don't we just start at
- 08:13the beginning and
- 08:14do a better job of this? So
- 08:17what if we started at the
- 08:18beginning of our studies by asking
- 08:20what the target population of the
- 08:22intervention was, thinking about
- 08:24inclusion and exclusion criteria?
- 08:25I think in health
- 08:27this probably matters even more, with,
- 08:29like, comorbidities and, you know,
- 08:31like ruling out people.
- 08:32You're doing a study on depression,
- 08:34but you rule out people with anxiety,
- 08:36and that's like a big problem
- 08:38for the interpretation since they're
- 08:39highly related to each other.
- 08:41This is true in education as well.
- 08:43So like,
- 08:44what are the inclusion exclusion
- 08:45criteria for your trial and how might
- 08:48that affect where you can generalize?
- 08:50And then also thinking about
- 08:51background characteristics and
- 08:52contextual variables that might
- 08:54moderate the intervention's effect
- 08:55and the tricky part here is,
- 08:56there's a little bit of a circularity
- 08:59which I'm going to keep coming
- 09:00back to in what I'm talking about,
- 09:02which is in order to know these,
- 09:05you know,
- 09:05we don't know what these are in advance,
- 09:08and we don't have a lot of
- 09:10knowledge generated to date about
- 09:11what these variables are,
- 09:13because studies have not been designed to
- 09:14estimate or test hypotheses about moderation,
- 09:17and so instead we have to sort of think
- 09:19through what we think might matter.
- 09:21Using,
- 09:22you know,
- 09:22not a great source of knowledge here,
- 09:25but the idea is that you sort of
- 09:27take all of this information and
- 09:29then you use this
- 09:31to use sampling methods to actually
- 09:33design recruitment procedures,
- 09:35like using stratified sampling,
- 09:36figuring out if you should,
- 09:38you know, within strata,
- 09:39use balanced sampling or random
- 09:41sampling, and thinking about sort of
- 09:44ways in which you can increase the
- 09:46coverage so you can have
- 09:48positivity for the whole target population,
- 09:50so that when you do, you know...
- 09:52When you do need to make adjustments
- 09:54at the end of your trial using
- 09:56these statistical methods,
- 09:57they are in a realm in which
- 09:59they can perform well.
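To make the stratified recruitment idea a bit more concrete, here is a minimal sketch of proportional allocation: given how many population units fall in each stratum, it splits a fixed number of recruitment slots across strata in proportion to stratum size. The function name and the stratum counts are hypothetical, and this is a generic textbook allocation rule, not a description of how any specific tool builds its plans.

```python
def proportional_allocation(stratum_sizes, n_sample):
    """Split n_sample recruitment slots across strata in proportion to size.

    stratum_sizes: {stratum: number of population units in that stratum}
    Returns {stratum: number of units to recruit}, summing to n_sample.
    """
    total = sum(stratum_sizes.values())
    exact = {s: n_sample * size / total for s, size in stratum_sizes.items()}
    # Start from the rounded-down shares...
    allocation = {s: int(q) for s, q in exact.items()}
    leftover = n_sample - sum(allocation.values())
    # ...then hand the remaining slots to the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda s: exact[s] - allocation[s], reverse=True)
    for s in by_remainder[:leftover]:
        allocation[s] += 1
    return allocation


# Hypothetical population of 1,000 schools in three strata, 40 schools to recruit.
print(proportional_allocation({"A": 500, "B": 300, "C": 200}, 40))
```

Every stratum with a nonzero allocation is then represented in the sample, which is what keeps later reweighting in a range where it can work.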
- 10:03This sort of led me to thinking about
- 10:05tools for generalization,
- 10:07and so I just want to highlight this
- 10:10because I think this is a good strategy
- 10:13for methods people to think about.
- 10:16So I thought, well, nobody is going to do
- 10:19what I'm telling them to do if I don't
- 10:21build a tool, because the kind of people who
- 10:24plan randomized trials,
- 10:25at least in my domain,
- 10:27don't often have statisticians
- 10:28at the ready to work with them on things,
- 10:31and they are often writing grant
- 10:33proposals before they've got funding,
- 10:34and so it's very possible that they're not
- 10:37going to think about generalization or
- 10:39have the training or tools to do it.
- 10:42So I got a grant from the Spencer
- 10:44Foundation, and then I've had follow-up
- 10:46money from the Institute of Education
- 10:49Sciences, to build this tool called
- 10:51The Generalizer that uses some
- 10:53basic design principles. It's standalone.
- 10:54It's got...
- 10:55It's very focused on the user
- 10:58experience, and in the background
- 11:01it has the Common Core of Data, which is
- 11:04an annual census of the public
- 11:06schools in the US, so the data
- 11:08has already been cleaned and set up.
- 11:11We're adding in right now the IPEDS data,
- 11:13which is higher ed data in the US, and
- 11:16so the idea is somebody could go in
- 11:18and walk through inclusion and exclusion
- 11:20criteria, identify moderators, and
- 11:22it would build you a stratified
- 11:24recruitment plan in less than an hour.
- 11:26You could leave with a list of all
- 11:29the schools and start being able
- 11:31to recruit with that.
- 11:35Great. I've had this going since 2015
- 11:37and it was very slow going for a while.
- 11:39This is sort of... I just realized this
- 11:42year that I could actually extract
- 11:44a lot of user data, and so what you
- 11:47can see here is actually it was slow
- 11:49going and I had some early adopters.
- 11:51These are people that would be star users,
- 11:54so many of them are planning
- 11:56randomized trials,
- 11:56but there was actually a very big
- 11:59jump that occurred this summer, and
- 12:01based on this jump
- 12:02I actually started digging through
- 12:04things and realized that the
- 12:05Institute of Education Sciences had actually
- 12:07enacted requirements for generalizability
- 12:09in their request for proposals,
- 12:11and so you can see what
- 12:13I had always speculated,
- 12:15which is that funders really drive change.
- 12:17So once funders said you need to
- 12:20pay attention to generalizability,
- 12:22people actually started paying attention
- 12:24to generalizability in their proposals.
- 12:27OK, so I just wanted to give
- 12:28you all of this background as a
- 12:30way of explaining sort of
- 12:32where I'm coming from
- 12:34on heterogeneity and how I'm
- 12:36thinking about this.
- 12:37Um...
- 12:37So everything I've talked about
- 12:39is sort of averaging over
- 12:42heterogeneity. When we talk about generalizing,
- 12:44we estimate an average treatment
- 12:46effect for a population,
- 12:47assuming that there's variation of
- 12:49effects, and we're averaging over those.
- 12:52But to average over those requires
- 12:54that we know something about how
- 12:56treatment effects vary, and very often,
- 12:58and I would say this is
- 13:00in general, we don't,
- 13:02and the reason that we don't have a
- 13:04great handle on this is because
- 13:06sample sizes in randomized trials
- 13:09have been very focused on
- 13:11the average treatment effect.
- 13:12Moderators have only become more of a focus,
- 13:15at least in education.
- 13:17More recently,
- 13:17and I think that's true in
- 13:20psychology and related areas as well.
- 13:23And they are often more like
- 13:24exploratory analyses at the end,
- 13:26so you end up with these problems
- 13:28where moderator effects don't get
- 13:29replicated, and they don't get
- 13:31replicated because there were,
- 13:32you know,
- 13:33who knows how many statistical tests
- 13:35conducted in order to find those moderators.
- 13:38So they're not very stable,
- 13:39and we don't really necessarily
- 13:41understand them, or they're underpowered,
- 13:42deeply underpowered, like you just
- 13:44have a very homogeneous sample.
- 13:45And so how are you going to find
- 13:47treatment effect variation if
- 13:49there's not much variation in your
- 13:51sample to start with?
- 13:52So they're often an afterthought,
- 13:54but what I noticed over time is
- 13:56that as generalizability has become
- 13:57something people are paying attention to,
- 13:59people are also starting to pay
- 14:01attention to the idea that you
- 14:03could predict treatment effects,
- 14:04or that you could identify subgroup
- 14:06effects and that this might
- 14:08be very useful information.
- 14:10Which led me to start thinking about
- 14:12how you would design trials for this.
- 14:15So what I'm going to,
- 14:16what I'm leading up to is talking
- 14:18about designing trials to think
- 14:20about heterogeneity.
- 14:20So I'm just going to start with
- 14:22like a little
- 14:23bit of a background here.
- 14:25So we're going to assume that you've got...
- 14:27I'm assuming we've got units,
- 14:28which are usually here,
- 14:29let's say, students in sites,
- 14:30which might be schools,
- 14:32and I'm doing a randomized trial,
- 14:33and I've got these potential outcomes.
- 14:36And so we've got both an
- 14:39average, an intercept, in these,
- 14:41and we've also got some sort of
- 14:43fixed variation that we can explain.
- 14:46And then we have these other parts that
- 14:48are not affected by the treatment.
- 14:51We've got some site-level and individual
- 14:54residuals and some idiosyncratic errors.
- 14:56But what we're interested in
- 14:58really is these moderators
- 15:00of treatment effects,
- 15:02and so you could say that delta-zero
- 15:04is the difference in averages.
- 15:07I'm assuming these are centered variables,
- 15:09so this is nicely the difference in
- 15:12averages, and that the vector delta
- 15:14is the difference between these
- 15:16effects of the covariates
- 15:18under treatment and under control.
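The slide with the model is not captured in the transcript, so the following is only a sketch of the kind of model being described. The symbols $\beta_0(t)$, $\boldsymbol{\beta}(t)$, $u_j$ and $e_{ij}$ are assumed notation for illustration, not necessarily what appears on the slide. For student $i$ in site $j$, with centered covariates $\mathbf{x}_{ij}$ and treatment condition $t \in \{0, 1\}$:

$$
Y_{ij}(t) = \beta_0(t) + \mathbf{x}_{ij}^\top \boldsymbol{\beta}(t) + u_j + e_{ij}
$$

$$
\delta_0 = \beta_0(1) - \beta_0(0), \qquad \boldsymbol{\delta} = \boldsymbol{\beta}(1) - \boldsymbol{\beta}(0)
$$

Because the residual terms $u_j$ and $e_{ij}$ are not affected by treatment, they cancel in the difference, leaving a unit-level treatment effect of

$$
Y_{ij}(1) - Y_{ij}(0) = \delta_0 + \mathbf{x}_{ij}^\top \boldsymbol{\delta}.
$$

With centered covariates, $\delta_0$ is the average treatment effect and each element of $\boldsymbol{\delta}$ describes how the effect changes with one moderator.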
- 15:26So you have to
- 15:27think about interpretability
- 15:28here, of what I mean
- 15:31by these deltas, and
- 15:33how to standardize. Because we
- 15:34want to talk about these,
- 15:36they need to have a mean of 0,
- 15:38but also, in order to talk
- 15:40about treatment effects,
- 15:41as is sort of done in general for
- 15:43developing things like power,
- 15:44we often standardize them. So
- 15:45often we have effect sizes for
- 15:47the average treatment effect;
- 15:48they're standardized in relation
- 15:49to the variation in the
- 15:52sample and the population.
- 15:53And so here I'm going to sort of
- 15:55say what we need to do is we
- 15:58need to standardize the covariates,
- 15:59and we need to standardize the
- 16:02covariates in relation to the
- 16:03population standard deviation.
- 16:04This might not seem like
- 16:06a radical statement,
- 16:08but if you look into the power analysis
- 16:10literature on how to conduct power
- 16:12analysis for moderator tests,
- 16:14they are typically standardizing in
- 16:15relation to the sample standard deviation,
- 16:17and in doing so,
- 16:18it makes it impossible to see
- 16:20how your sample, how you
- 16:22choose your sample, might matter.
- 16:24So by standardizing by this
- 16:25fixed value, by the population,
- 16:27you've identified a population,
- 16:28and now we're standardizing by
- 16:29that population standard deviation.
- 16:31That will make the role that the
- 16:34sample plays here much more clear.
- 16:37OK,
- 16:37so the fact that we randomized
- 16:39to treatment and control allows
- 16:41us to estimate this
- 16:43vector delta using generalized
- 16:46least squares of some sort,
- 16:48and I'm being a little vague here
- 16:49because I'm trying to encapsulate
- 16:51cluster-randomized, randomized
- 16:53block, individually randomized,
- 16:54all like versions of this.
- 16:56OK, so I can separate...
- 17:00These are additive, or rather subtractive,
- 17:02I guess, the treatment and
- 17:04the control there, so
- 17:06you can separate them.
- 17:07And through this I can
- 17:10think about statistical power
- 17:12for each of these moderator effects.
- 17:16And so,
- 17:16one way you can do that is through the
- 17:18minimum detectable effect size difference.
- 17:20I don't know how commonly this
- 17:22is used in the sort of
- 17:23biostats world,
- 17:24but it's a pretty common metric that's
- 17:27used in cluster-randomized trials in
- 17:30the world I work in,
- 17:32and so it's nice because it's
- 17:34sort of easily interpretable.
- 17:36So this is the smallest effect size
- 17:39that you could, for a given
- 17:42alpha level, which is affecting this
- 17:44M sub nu, this is...
- 17:47that's like the critical value.
- 17:49This is sort of the smallest
- 17:51true effect that you could detect
- 17:53with the power that you... with like 80%
- 17:55power, for example.
- 17:58And so this is like a general form for this,
- 18:01and so what I'm showing is that it's a function...
- 18:04can I, like, move my hands?
- 18:06I don't know.
- 18:07I'm just going to involve a
- 18:09lot of... never mind.
- 18:10So it's a function of the variation
- 18:13in the population in that covariate.
- 18:15It's also a function of S,
- 18:17which you could think of as
- 18:19the sort of covariance matrix
- 18:20of the X's in the sample,
- 18:23so those are different.
- 18:24And then it's a function of N,
- 18:26which is the sample size per cluster.
- 18:29I'm assuming it's constant here.
- 18:30J is the number of clusters and P
- 18:33is the proportion in treatment. So
- 18:36what is sigma XK squared? Is that
- 18:38the population SD of the effect modifier or
- 18:40the population variance of the effect modifier?
- 18:42It's the population variance
- 18:44of the effect moderator or modifier,
- 18:46but then you're square-rooting it, so
- 18:48it's going to be on the original scale.
- 18:53OK, so just to give you a couple
- 18:55of special cases where you can
- 18:57sort of parse out some things.
- 18:59So there's been previous work,
- 19:01I meant to include a citation here, by
- 19:04Jessaca Spybrook and colleagues
- 19:05that's looking at power
- 19:06for moderator tests.
- 19:08And so here's two cases we have:
- 19:10site-level
- 19:12moderators and
- 19:14individual-level moderators, and
- 19:15I'm taking basically what they've got,
- 19:17but tweaking part of it,
- 19:21because I'm factoring out this sigma
- 19:24squared and noting that you can
- 19:26actually pull out this thing called
- 19:28RXK at the front, and RXK is
- 19:30this ratio of the standard deviation
- 19:32of the covariate in the sample
- 19:34compared to the standard deviation
- 19:36of the covariate in the population.
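To see why this ratio ends up out front, here is a simplified worked case rather than the general formula on the slide. It assumes a single moderator $x_k$, individual-level randomization with $N$ units in total, a proportion $P$ in treatment, and a residual standard deviation roughly equal to the outcome standard deviation $\sigma_y$. These assumptions, and the symbol $s_{x_k}$ for the sample standard deviation of the moderator, are introduced here for illustration only.

The least squares standard error of the treatment-by-moderator coefficient is approximately

$$
\mathrm{SE}(\hat{\delta}_k) \approx \frac{\sigma_y}{s_{x_k}\sqrt{N P (1 - P)}}.
$$

Standardizing by the population standard deviation $\sigma_{x_k}$, so that $\delta_k^{*} = \delta_k \, \sigma_{x_k} / \sigma_y$, gives

$$
\mathrm{MDESD}_k \approx M_\nu \cdot \frac{\sigma_{x_k}}{s_{x_k}} \cdot \frac{1}{\sqrt{N P (1 - P)}}
= \frac{M_\nu}{R_{x_k} \sqrt{N P (1 - P)}},
\qquad R_{x_k} = \frac{s_{x_k}}{\sigma_{x_k}}.
$$

Had the covariate been standardized by the sample standard deviation instead, the ratio would cancel and the role of the sample would be invisible.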
- 19:38And so what you can see here is, by
- 19:40rewriting it
- 19:43this way, you can see that RXK is
- 19:45having just as much of an effect on
- 19:48statistical power as things like the
- 19:50square root of N or the square root of P,
- 19:54these other parameters that most power
- 19:56analysis has focused on,
- 19:57and that's true,
- 19:58you know, in any of these designs.
- 20:00You'll see it in any of these designs:
- 20:03RX shows up. OK,
- 20:06so if RX is something that matters for power,
- 20:10a question will be, well,
- 20:12what are people doing in
- 20:14practice right now, right?
- 20:16So maybe people are choosing
- 20:19fairly heterogeneous samples,
- 20:20and so what I've got here is 19...
- 20:23This is 19 randomized trials
- 20:26in education that we extracted
- 20:28information from, and we've got...
- 20:31So these are box plots of
- 20:33values across each of these 19,
- 20:35and for each of them I've calculated,
- 20:38holding, like, the US population of schools,
- 20:40so this is like the US population
- 20:43of, let's say, elementary schools,
- 20:45I'm looking at the ratio of the standard deviation of this
- 20:47moderator in the sample in these
- 20:49studies to that
- 20:52standard deviation in the population, OK,
- 20:55and then I'm looking at boxplots of
- 20:57this and what you can see like do this,
- 21:00don't?
- 21:01OK, what you can see here is that
- 21:03the bar at the bottom...
- 21:05Can you see my cursor?
- 21:08Can't tell if you guys can see my cursor.
- 21:10No,
- 21:11you can't see my cursor.
- 21:12OK,
- 21:12so the bar at the bottom, that's an
- 21:14RX = sqrt(1/2), and then there's a line
- 21:17across the top that's like a dashed one.
- 21:19That's the RX = sqrt(2).
- 21:22OK,
- 21:22and so you can see that most
- 21:24studies are actually below... they're
- 21:26less heterogeneous than the population. They're
- 21:28below this line for one,
- 21:30and they're actually far less heterogeneous
- 21:32than the population.
- 21:35Actually,
- 21:35if you look at these median values,
- 21:38many of them are close to 0.5,
- 21:40so they have about 1/4 of the variation
- 21:43that we're seeing in the population.
- 21:45So this gives you a sense
- 21:49that there's
- 21:50an opportunity to improve, right?
- 21:52Like I could increase power not just by
- 21:55increasing my sample size, or increasing
- 21:58my sample size in schools or
- 21:59my sample size of the number of
- 22:01schools, which are pretty expensive,
- 22:03but I could also increase my power by
- 22:05changing the kinds of samples that I select.
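To put rough numbers on that tradeoff, here is a minimal sketch under simplifying assumptions that are not from the talk: one moderator, individual-level randomization, a normal approximation for the multiplier, and no variance explained by covariates. Under those assumptions the minimum detectable effect size difference is roughly the multiplier divided by RX times the square root of N·P·(1−P). The function name `mdesd_simple` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mdesd_simple(n_total, p_treat, r_x, alpha=0.05, power=0.80):
    """Approximate MDESD for one moderator in a simple two-arm trial.

    n_total: total number of units; p_treat: proportion in treatment;
    r_x: sample SD of the moderator divided by its population SD.
    Assumes individual randomization and a large-sample normal multiplier.
    """
    z = NormalDist()
    # Two-sided critical value plus the quantile for the desired power (about 2.8).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier / (r_x * sqrt(n_total * p_treat * (1 - p_treat)))


# Same sample size, three levels of sample heterogeneity.
for r_x in (0.5, 1.0, sqrt(2)):
    print(f"R_X = {r_x:.2f}: MDESD = {mdesd_simple(1000, 0.5, r_x):.3f}")
```

In this simplified setting, a sample with RX = 0.5 needs four times as many units as a sample with RX = 1 to reach the same MDESD, which is the sense in which the choice of sample matters as much as its size.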
- 22:10And so that's where these numbers came from.
- 22:12They should have gone in a
- 22:14slightly different order.
- 22:15So the main point is that design
- 22:17sensitivity, however we think of it,
- 22:19whether that's statistical power or
- 22:21standard errors or whatever framework,
- 22:22is proportional in
- 22:25some way to this RX value, and that we can
- 22:27improve our design sensitivity by
- 22:29choosing a more heterogeneous sample.
- 22:32And so, funny, I must have like put
- 22:34this in here twice on accident.
- 22:36So this is the same thing but
- 22:38with a line through it.
- 22:40OK, so once you have that insight
- 22:42that heterogeneity matters,
- 22:43that it's actually something that
- 22:45we can include in our power analysis
- 22:47and that is something that is not
- 22:49actually happening in practice,
- 22:50Then we can start thinking about how
- 22:52we might plan studies differently.
- 22:54OK, so...
- 22:56So how can we improve statistical power?
- 22:58Well,
- 22:58a lot of the literature, as I was saying,
- 23:01is focused on improving power by
- 23:03increasing sample size.
- 23:04But what I'm arguing here is that you
- 23:07could increase instead this ratio.
- 23:08You could increase the variation
- 23:10in your sample, choosing a more
- 23:11heterogeneous sample, and you'll have
- 23:12more statistical power for tests
- 23:14of heterogeneity, of moderators.
- 23:15And so what would you do with this?
- 23:18It would mean, you know, purposefully
- 23:19choosing sites that were more extreme,
- 23:21and
- 23:22that's easy enough to do in one variable.
- 23:25And I'm going to talk a little bit about
- 23:28how to do that with multiple variables.
- 23:31So with a simple...
- 23:32let's just say we had one single continuous
- 23:35moderator, like this, that is
- 23:37normally distributed.
- 23:38This theory would tell us that we
- 23:41should choose half of our sample...
- 23:42we would choose half of our sample
- 23:45from the upper and
- 23:47lower tails, and choosing them from
- 23:49the upper and lower tails we're
- 23:51actually getting an RX of sqrt(2).
- 23:53This is actually rather large,
- 23:55so this is going to create a much
- 23:57more heterogeneous sample,
- 23:59thus increasing our statistical power
- 24:01because it's more heterogeneous than
- 24:03the population.
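As a quick check on the tails idea, the sketch below computes RX when a standard normal moderator is sampled only from units beyond a cutoff, half from each tail. The cutoff of 0.75 population standard deviations is an assumption chosen for illustration (roughly the outer 23% on each side); the talk does not state which cutoff was used. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def r_x_from_tails(cutoff):
    """SD ratio when sampling only units with |x| > cutoff, half from each tail.

    Assumes the moderator x is standard normal in the population (SD = 1).
    """
    z = NormalDist()
    tail_prob = 1.0 - z.cdf(cutoff)
    # For a standard normal, E[x^2 | x > c] = 1 + c * pdf(c) / P(x > c).
    # With equal numbers from both tails the sample mean is 0 by symmetry,
    # so this second moment is also the sample variance.
    sample_variance = 1.0 + cutoff * z.pdf(cutoff) / tail_prob
    return sqrt(sample_variance)


for cutoff in (0.0, 0.75, 1.5):
    print(f"cutoff = {cutoff:.2f} SD: R_X = {r_x_from_tails(cutoff):.3f}")
```

A cutoff of zero reproduces the population (RX = 1), while a cutoff near 0.75 gives an RX close to sqrt(2). Pushing the cutoff further out raises RX but leaves fewer population units to recruit from.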
- 24:07Similarly, if we had two
- 24:09uncorrelated normal variables,
- 24:10this is, you know,
- 24:12we could imagine getting the corners of this.
- 24:14These are all principles, by the way,
- 24:17straight from experimental design.
- 24:18If you think about it,
- 24:21they're principles from, like,
- 24:23you know, two-factor studies or
- 24:25multi-factor studies where you're
- 24:26manipulating, and instead I'm just saying
- 24:29instead of manipulating these factors
- 24:31we're now measuring these factors.
- 24:32So you could choose them
- 24:34to be extreme design points.
- 24:36It gets a little harder once
- 24:37things become correlated,
- 24:38so when they become correlated,
- 24:40I don't have as much sample available
- 24:42to me because there's just fewer
- 24:44population units in those corners,
- 24:46and so it's going to become
- 24:47increasingly hard as I add variables.
- 24:49It might become harder and harder
- 24:52to figure out what these units
- 24:54are that I could be sampling from.
- 25:02So I started thinking about
- 25:04how you would do this,
- 25:06and I realized that there is actually
- 25:08a literature on this in the world
- 25:11of sort of industrial experiments.
- 25:12In industrial experiments
- 25:14and in psychology, people again are
- 25:16thinking about multi-factor studies.
- 25:18So they're thinking about things
- 25:21that are in the experimenter's control.
- 25:25But we could instead think about sampling in
- 25:27the same kind of way,
- 25:30except that we don't have control
- 25:32over manipulating them.
- 25:33We can find these units
- 25:35as an alternative approach.
- 25:36so one of the things we want to
- 25:38do is we want to make sure that
- 25:40we observe the full range of
- 25:43covariate values in the population,
- 25:45so it requires us to actually think,
- 25:47you know,
- 25:47explore the population data and
- 25:49make sure that we can understand
- 25:51what that range of values is.
- 25:53We might need to think carefully about
- 25:55moderators that are highly correlated.
- 25:57It can be very hard to de-alias these effects,
- 26:00so if you have two highly correlated
- 26:02moderators... I think about that.
- 26:04I have two highly correlated
- 26:06moderators like this.
- 26:06If I want to estimate and
- 26:08understand moderators,
- 26:10if I want to explore X and Z and
- 26:11these are highly correlated,
- 26:13I'm going to really need to make sure
- 26:15I have those off-diagonals that are
- 26:17kind of more rare in order to help me
- 26:20separate these effects and understand
- 26:22the unique contribution of each.
- 26:23The other is that we might
- 26:25have many potential moderators
- 26:27that we're interested in,
- 26:28and so we're going to have to
- 26:30anticipate this in advance and think
- 26:32carefully about sort of compromises
- 26:34we might need to make here.
- 26:36But also think very carefully,
- 26:37like, we're not going to be able to expand
- 26:40this study to have a much bigger sample.
- 26:42So a lot of what I'm trying to
- 26:44operate under, the constraint here is,
- 26:46let's not change the sample size.
- 26:47If we don't change the sample size,
- 26:49but we instead change the
- 26:51types of units in our study,
- 26:52how much better can we do?
- 26:55OK,
- 26:56so this leads to a principle found
- 26:58in response surface models called D
- 27:00optimality and so AD optimal design.
- 27:03This is work from the 40s and 60 Forties,
- 27:0750s and 60s.
- 27:08A lot of work here by Walt Kiefer,
- 27:11and a lot of people in
- 27:13industrial experiments.
- 27:14The idea is that you can instead focus
- 27:16on the generalized variance an you want
- 27:19to minimize the generalized variance,
- 27:22which is the determinant.
- 27:24So D is for determinant.
- 27:26And so the design that meets
- 27:27this criteria is one that also
- 27:29conveniently minimizes the maximum
- 27:31variance of any predicted outcome
- 27:33based upon these covariates.
- 27:34So this is great if what you're
- 27:36headed for is trying to predict
- 27:39individual treatment effects or
- 27:40site-specific treatment effects.
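
As a rough illustration of the criterion just described (not taken from the talk's slides), the sketch below computes the determinant of the information matrix X'X for two hypothetical samples of the same size: one clustered near the middle of a covariate and one pushed toward the extremes. The helper names `d_criterion` and `with_intercept` and the toy values are assumptions made for this example.

```python
import numpy as np

def d_criterion(X):
    """Return det(X'X) for a design matrix X (rows = sites, columns = model terms).

    Maximizing det(X'X) is the same as minimizing det((X'X)^-1), the
    generalized variance of the coefficient estimates (up to the error variance).
    """
    X = np.asarray(X, dtype=float)
    return float(np.linalg.det(X.T @ X))

def with_intercept(x):
    """Build a two-column design matrix: intercept plus one covariate."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones(len(x)), x])

# Hypothetical covariate values for six sites in each sample.
middle = with_intercept([0.40, 0.45, 0.50, 0.50, 0.55, 0.60])
extreme = with_intercept([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

d_middle = d_criterion(middle)
d_extreme = d_criterion(extreme)
print(d_middle, d_extreme)
```

With the same number of sites, the sample spread toward the extremes has the much larger determinant, which is the sense in which it carries more information about the covariate's coefficient.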
- 27:43The nice thing about a method
- 27:44that's been around for a while
- 27:46is that there have been algorithms
- 27:48developed for doing this.
- 27:49The Fedorov-Wynn algorithm
- 27:51is widely used, and variations
- 27:53of it, and there are
- 27:55statistics packages
- 27:57already available that do this.
- 27:59So in R there's something called
- 28:01the AlgDesign package that is set
- 28:04up to actually work through this.
- 28:06So designs that we know are optimal
- 28:08in other contexts,
- 28:09you know, like Latin square
- 28:12designs, etc., all become special cases of this.
- 28:15So this is a much more general framework
- 28:19that doesn't require as many assumptions.
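
To make the idea of an exchange algorithm concrete, here is a minimal sketch in the spirit of Fedorov's approach. It is a simplified illustration written for this transcript, not the AlgDesign implementation: the function names, the toy population, and the stopping rule are all assumptions. It starts from a random sample of population units and keeps swapping a chosen unit for an unchosen one whenever the swap increases log det(X'X).

```python
import numpy as np

def log_det_info(X):
    """log det(X'X); returns -inf when the information matrix is singular."""
    sign, logdet = np.linalg.slogdet(X.T @ X)
    return logdet if sign > 0 else -np.inf

def exchange_design(candidates, n, seed=0, max_passes=50):
    """Pick n rows of `candidates` by repeated single-unit exchanges."""
    rng = np.random.default_rng(seed)
    n_units = candidates.shape[0]
    chosen = [int(i) for i in rng.choice(n_units, size=n, replace=False)]
    best = log_det_info(candidates[chosen])
    for _ in range(max_passes):
        improved = False
        for slot in range(n):
            for unit in range(n_units):
                if unit in chosen:
                    continue  # each population unit can be sampled only once
                trial = list(chosen)
                trial[slot] = unit
                value = log_det_info(candidates[trial])
                if value > best + 1e-12:
                    chosen, best, improved = trial, value, True
        if not improved:
            break  # no single swap helps any more
    return sorted(chosen), best

# Hypothetical population: 21 units with an intercept and one covariate on [0, 1].
x = np.linspace(0.0, 1.0, 21)
population = np.column_stack([np.ones_like(x), x])
selected, value = exchange_design(population, n=4)
print(selected, value)
```

Because each population unit exists only once, the search cannot stack every design run on the two endpoints; it has to take the most extreme units that are actually available.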
- 28:22OK, so once you start down this
- 28:24path you realize too that there are
- 28:27some tradeoffs here.
- 28:29You can easily imagine that the design that
- 28:32is optimal for an average treatment effect,
- 28:35which might be a representative sample
- 28:37that's sort of like a miniature of the
- 28:39population on covariates, is likely not
- 28:42optimal for some of these standardized
- 28:44effect size differences, where we might
- 28:46need to oversample in order to
- 28:49estimate these.
- 28:50And so there's another
- 28:52benefit of this approach,
- 28:53which is that you can use an
- 28:56augmentation approach, and what that
- 28:58means is you can actually say, using
- 29:00these algorithms, I've already got 30
- 29:02sites, so I've already got
- 29:0530 design runs (in the language of this,
- 29:08these sites become design runs),
- 29:11and I need to select 10 more.
- 29:15Meaning, among the population units, what
- 29:1710 units can I augment it with that
- 29:20will make this as D-optimal
- 29:23as possible given these constraints?
- 29:26And so we're thinking
- 29:29of population units as possible
- 29:31design runs and the sample as design
- 29:34runs that we've chosen to use.
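
The augmentation idea can be sketched as a greedy search: hold the already-selected sites fixed and add, one at a time, the population unit that most increases log det(X'X). This is an illustrative simplification, not the routine used in the talk; the function name `augment_design`, the toy population, and the choice of a greedy rule are assumptions.

```python
import numpy as np

def augment_design(candidates, fixed, k):
    """Greedily add k rows of `candidates` to the rows already in `fixed`."""
    chosen = list(fixed)
    for _ in range(k):
        best_unit, best_value = None, -np.inf
        for unit in range(candidates.shape[0]):
            if unit in chosen:
                continue  # already in the sample
            X = candidates[chosen + [unit]]
            sign, logdet = np.linalg.slogdet(X.T @ X)
            value = logdet if sign > 0 else -np.inf
            if best_unit is None or value > best_value:
                best_unit, best_value = unit, value
        chosen.append(best_unit)
    return chosen[len(fixed):]

# Hypothetical population: 21 units, intercept plus one covariate on [0, 1].
x = np.linspace(0.0, 1.0, 21)
population = np.column_stack([np.ones_like(x), x])

# Suppose three "typical" sites near the middle were already recruited.
already_recruited = [9, 10, 11]
added = augment_design(population, already_recruited, k=2)
print(added)
```

If a site that was added later declines to participate, the same function can be rerun with that unit removed from `candidates`, which mirrors the iterate-on-refusal use mentioned later in the discussion.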
- 29:36OK, so I'm just going to go through
- 29:40an example to talk about this.
- 29:45I don't have a ton more slides,
- 29:47I should say.
- 29:49OK, so here's an example.
- 29:51The Success for All evaluation was
- 29:53an elementary school reading program
- 29:55evaluated between 2001 and 2003.
- 29:57The reason I like to use this example
- 30:00is that it's old enough that, strangely,
- 30:02they actually published in their
- 30:04paper a list of schools; they
- 30:06actually named the schools in their
- 30:08study and characteristics of them.
- 30:09I have data on other studies
- 30:11where people have shared with me
- 30:13the names of the schools involved,
- 30:15but I have to keep it
- 30:17secret for IRB reasons,
- 30:20so the fact that this is
- 30:22available makes it easier to use.
- 30:23So what I did is I went back
- 30:25and looked at the Common Core of
- 30:28Data. I identified, based upon the
- 30:30study and
- 30:31the way that they
- 30:32talked about it,
- 30:34that Title I elementary schools
- 30:35in the US at that time might be a
- 30:38reasonable population to think that
- 30:40they were trying to sample from.
- 30:42Title I schools have at least 40%
- 30:45students on free or reduced lunch and
- 30:46meet a few other characteristics. And
- 30:48then they identified in the paper 5
- 30:51variables that they thought were possible
- 30:53moderators that would be really
- 30:55important to include here,
- 30:56so they talked about total school
- 30:58enrollment being a factor,
- 30:59racial and ethnic composition
- 31:01of the students.
- 31:02So I'm using that here as the
- 31:04proportion of students that are Black
- 31:06and the proportion that are Hispanic,
- 31:08and SES, meaning the
- 31:10proportion on free or reduced lunch.
- 31:12And they also talk about urbanicity,
- 31:14because they tried to make sure they
- 31:16had some urban schools, some rural
- 31:18schools, and some other schools.
- 31:19So I should say in previous work of
- 31:21mine I've used this as an example
- 31:23and this study actually ends
- 31:25up being a fairly representative
- 31:27sample of the population,
- 31:28which is interesting
- 31:29because they had no real way
- 31:31of, they weren't doing it totally in
- 31:33a way that allowed them to compare
- 31:35this or to choose this,
- 31:37but they did a lot of work to try
- 31:40to be representative,
- 31:41and this is a much more representative
- 31:43sample than
- 31:44I think the modal study is in this domain.
- 31:47OK,
- 31:47So what I did for this example
- 31:50is I'm comparing for you the actual
- 31:53sample that they selected
- 31:55(so it's always these five moderators):
- 31:57the actual sample selected, a
- 31:59representative sample selected
- 32:00if I instead used something like
- 32:03stratified random sampling,
- 32:04the optimal sample based upon these
- 32:06five covariates using this AlgDesign
- 32:08package, and then various
- 32:10augmentation allocations.
- 32:11And so what I would do here is I'd say:
- 32:16so if I took 41, you know,
- 32:19so if I used 36 sites that were
- 32:21selected with
- 32:23stratified random sampling and
- 32:25then I reserved five of them that
- 32:27were selected using D-optimality,
- 32:28and then I would change, you know,
- 32:30the number of those,
- 32:32so you could see the sort of effect,
- 32:34you know, that augmentation would
- 32:36have. And then for each of these I
- 32:38calculated a few different statistics,
- 32:40so you can see how this works.
- 32:42So one of them is D,
- 32:44this measure of D-optimality.
- 32:46And I'm going to show you relative
- 32:48measures because it's a little easier
- 32:51to see with relative measures.
- 32:52I'm also including B,
- 32:54which is a generalizability
- 32:55index that I developed.
- 32:57It ranges from zero to one, and one means
- 32:59that the sample is an exact miniature
- 33:02of the population on these covariates.
- 33:04Zero means they are completely
- 33:07orthogonal to each other, and
- 33:10the index is highly related
- 33:14to measures of undercoverage and
- 33:18the performance of reweighting methods.
- 33:21And then the mean R, meaning the
- 33:24ratio between the
- 33:27standard deviation in the sample and the
- 33:30population, averaged across these five covariates.
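
The R statistic described here is simple to compute once sample and population covariates are in hand. The sketch below is a hedged illustration with made-up data; the function name `sd_ratio` and the toy arrays are assumptions. The B index is not reproduced, since its definition is not given in the talk.

```python
import numpy as np

def sd_ratio(sample, population):
    """Per-covariate ratio of sample SD to population SD (columns = covariates)."""
    sample = np.asarray(sample, dtype=float)
    population = np.asarray(population, dtype=float)
    return np.std(sample, axis=0, ddof=1) / np.std(population, axis=0, ddof=1)

# Hypothetical population of 11 units with two covariates.
population = np.column_stack([np.arange(11), 2 * np.arange(11)])

# A sample pushed to the extremes versus one drawn from the middle.
extreme_sample = population[[0, 1, 9, 10]]
middle_sample = population[[4, 5, 6]]

r_extreme = sd_ratio(extreme_sample, population)
r_middle = sd_ratio(middle_sample, population)
mean_r_extreme = float(r_extreme.mean())
mean_r_middle = float(r_middle.mean())
print(mean_r_extreme, mean_r_middle)
```

A mean R above one indicates more covariate spread in the sample than in the population; a value below one indicates a more homogeneous sample.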
- 33:34OK, so this is what we get out of this,
- 33:36and so I just want to talk through this and
- 33:38I'm happy to answer questions if there are any.
- 33:40I know there's a lot going on here.
- 33:43I really wish I could figure out how to do a
- 33:47pointer.
- 33:49I don't think I can point that way.
- 33:52OK, so OK.
- 33:52So what I have going on here is the number
- 33:55of sites randomly selected, left to right.
- 33:58So on the left is
- 33:59the D-optimal sample,
- 34:01meaning all 41 sites
- 34:03were actually selected using a
- 34:05D-optimal algorithm. On the
- 34:07right is the ideal for the
- 34:09average treatment effect:
- 34:10we've used
- 34:13stratified random sampling for
- 34:15the whole sample. And the bar right
- 34:17there, right up there,
- 34:19this gray vertical bar, is the actual
- 34:22study values for each of these.
- 34:24OK so you can see the actual study
- 34:26and then what I've got are three
- 34:29different lines going on here.
- 34:31So one line that's sloping down in
- 34:33solid is the relative D-optimality value,
- 34:36so this is,
- 34:37you know, the highest value is if
- 34:40it was a D-optimal allocation.
- 34:42This is a ratio.
- 34:44And then I've got the B index,
- 34:46which is the generalizability index;
- 34:48it's the other solid line, going up,
- 34:50and so, not surprisingly,
- 34:52that's increasing as we get
- 34:53to stratified sampling,
- 34:54so these are going in opposition
- 34:57to each other,
- 34:58is what I'm saying. And then this
- 35:00relative average standard deviation
- 35:01is this dotted line.
- 35:03So the main message of
- 35:05this is that these are going in
- 35:08opposite directions, right?
- 35:09The sample that is optimal for the
- 35:12average treatment effect is on the right.
- 35:14The sample that is optimal for
- 35:16moderator effects is on the left,
- 35:18and so there are tradeoffs
- 35:20involved in these; what's best
- 35:23for one is not best for the other.
- 35:25But there are other lessons in here,
- 35:27though. So the B index,
- 35:29which is a measure of similarity
- 35:31between the sample and population,
- 35:32is actually not that bad
- 35:34for the optimal sample.
- 35:35So the sample is
- 35:36different from the population.
- 35:38You'd have to do some reweighting,
- 35:40but it wouldn't be a tremendous
- 35:41amount of reweighting to be able to
- 35:44estimate the average treatment effect.
- 35:45And so one lesson that you
- 35:47could take from
- 35:48this is, if we designed
- 35:50randomized trials to test moderators,
- 35:52we'd actually be in a pretty
- 35:54good space to test moderators
- 35:55and to estimate the average treatment effect.
- 35:57It wouldn't be that far off.
- 35:58It wouldn't be,
- 36:00it wouldn't be terrible,
- 36:01and that makes sense because we're
- 36:03covering so much of the population
- 36:06by getting heterogeneity across a bunch of
- 36:09moderators that
- 36:11we can reweight. We're in a domain in
- 36:13which there are no extrapolations,
- 36:15we have positivity, we can reweight.
- 36:18Another finding here, I think, is
- 36:21if we look over at the right hand side,
- 36:24you know, the trade off is,
- 36:27if I do select for the average
- 36:30treatment effect,
- 36:31you know,
- 36:32I can select for the average
- 36:34treatment effect and do pretty
- 36:36well for the average treatment effect,
- 36:37but not do so well
- 36:39for the moderators,
- 36:40and so what's ideal for the average
- 36:42is definitely not ideal for
- 36:43the moderator tests.
- 36:44And then the third thing would
- 36:46be if you look at the actual study.
- 36:48As I was saying,
- 36:49they actually did a pretty good job
- 36:51in terms of representativeness.
- 36:52You can see that in that top dot,
- 36:54but if you look at the bottom,
- 36:56at the other two dots, you can
- 36:59see they didn't do so well for
- 37:01being able to test these moderators.
- 37:07OK, so in case that was not intuitive,
- 37:09another way you could look at this
- 37:11is to actually just look at what
- 37:14the features
- 37:16of these samples would look like.
- 37:18So the top row here
- 37:20is population distributions
- 37:22of these five covariates that
- 37:24were identified,
- 37:25and then the bottom row is
- 37:27actually the study that they had,
- 37:29so what their actual sample looked like.
- 37:33And then the middle is what a
- 37:35D-optimal sample would look like.
- 37:36And then I've overlaid on here
- 37:38these R values,
- 37:39so giving you a sense: if R is 1,
- 37:43it means the sample has the same
- 37:46standard deviation as the population.
- 37:48If R is greater than one,
- 37:50it means I've got more heterogeneity
- 37:52in my sample than in my population,
- 37:54which improves my ability to
- 37:55estimate moderator effects.
- 37:57And so what you see are a few things.
- 38:00One is that the optimal sample
- 38:02pushes things towards the extremes,
- 38:05right?
- 38:05It's pushing them towards the
- 38:07extremes to get endpoints, which we
- 38:09know from basic experimental design
- 38:11improves our ability.
- 38:12The other nice thing, though:
- 38:14a concern always when you're
- 38:16doing experimental design like this
- 38:18is that you're going to get
- 38:20highly focused on a linearity
- 38:22assumption, that
- 38:25your ideal sample would have a
- 38:27strong linearity assumption to it.
- 38:29But because you have multiple variables, and
- 38:31because not all design runs are possible
- 38:34in the population,
- 38:35you end up with these middle points
- 38:37as well, so you don't end up with
- 38:39only things on both extremes.
- 38:41You end up with some middle points,
- 38:43which allow you to be able to estimate
- 38:47nonlinear relationships as well.
- 38:48And a third point would be:
- 38:50you can see that you would just end
- 38:52up with a lot more variation. And so,
- 38:54not surprisingly, total students,
- 38:56which, again, school studies
- 38:57tend to overrepresent very large
- 38:58schools and large school districts.
- 39:00You can see this is a place where
- 39:02there would be a real
- 39:04opportunity for a change: in
- 39:05the sample this was less than one,
- 39:07and in the optimal sample
- 39:09it would be greater than three.
- 39:11But you can see this for most of
- 39:14these variables, that
- 39:16you could potentially improve your
- 39:17power and ability to estimate things
- 39:19related to demographics as well.
- 39:21And in my paper I actually show that
- 39:23because many of these are proportions,
- 39:26you can actually also think about
- 39:27student-level moderators, because
- 39:29for proportions, conveniently, the
- 39:31variation at the
- 39:33individual level is a function of
- 39:34the proportion at the aggregate.
- 39:36And so you can actually kind of work out
- 39:39a way to select your samples so that you can
- 39:42estimate individual effects,
- 39:44not just cluster aggregates,
- 39:47for those variables.
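
The remark about proportions rests on a standard fact about binary variables, written out here as a brief aside. The notation is an assumption for illustration and is not taken from the slides: $X_{ij}$ is a binary student-level indicator for student $i$ in site $j$, and $p_j$ is the proportion of students in site $j$ with $X_{ij} = 1$.

```latex
\operatorname{Var}\left(X_{ij} \mid \text{site } j\right) = p_j \,(1 - p_j)
```

So the site-level proportion alone determines how much within-site variation there is, and that variation is largest when $p_j = 0.5$. A continuous student-level variable has no such shortcut, because its within-site standard deviation is not determined by its site mean.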
- 39:50OK, and so then the final point.
- 39:52I just want to make is that the
- 39:54other thing that this shows is that
- 39:57there's real benefit to augmentation.
- 39:59So maybe, you know,
- 40:00maybe I'm not going to be able to
- 40:03convince people to switch to selecting
- 40:05their samples based upon extremes.
- 40:07But maybe you can convince people
- 40:10that they could reserve,
- 40:12you know, 10% or 25% of their
- 40:15sample for D-optimality.
- 40:16So you choose,
- 40:17in this case it would be like, choose
- 40:1930 of your sites using stratified
- 40:22sampling to represent the population,
- 40:24and then look for an additional
- 40:27ten or 11 sites that might be
- 40:29more extreme, that allow you to make
- 40:32sure that you can estimate
- 40:34these moderator effects
- 40:36that you're interested in.
- 40:37And you can see, if you
- 40:39follow these little lines, you can
- 40:41see that doing so doesn't have a huge
- 40:43effect on the average treatment effect,
- 40:44but it does greatly improve
- 40:46your ability to test moderators.
- 40:50OK, so just to wrap up, my take-home
- 40:53points today, I suppose, would be that
- 40:56the design of randomized trials has big
- 40:59implications for the ability to generalize.
- 41:02And I think what I've
- 41:04seen over time is that people are
- 41:07starting to pay attention to that,
- 41:08and they're starting to think
- 41:10about populations, you know,
- 41:11what are the populations. I would
- 41:13add, as a side benefit of this,
- 41:15I've watched, in
- 41:17asking scientists to think
- 41:19about what the population is,
- 41:20it actually sometimes makes them change
- 41:22what the intervention is, because you kind
- 41:24of have to realize, like,
- 41:27if this is the population,
- 41:28is this the right intervention?
- 41:31The second sort of point I would say,
- 41:34is that if we want to estimate
- 41:36and test hypotheses about moderators,
- 41:38we would be wise to actually
- 41:41plan to do so and to think about how
- 41:44to have better design sensitivity
- 41:45and statistical power for
- 41:47doing so instead of waiting until
- 41:49the end. And then the last point is
- 41:51just that this augmentation approach
- 41:53indicates that we don't have to
- 41:55be perfect at this;
- 41:57we could just, you know,
- 41:58do this for part of our sample
- 42:01and we would be better off. And then
- 42:04I guess I would say maybe my general
- 42:06philosophy in all of this design is that
- 42:08What I'm trying to do is to get people
- 42:10to think differently and plan differently,
- 42:13and by doing so,
- 42:14even if you don't succeed 100%,
- 42:15you're better off than you would
- 42:17have been before,
- 42:18and you're now able to be in the
- 42:20realm in which you have
- 42:22positivity and heterogeneity,
- 42:23and you're able to actually
- 42:25use statistical methods
- 42:26to get better estimators at the end.
- 42:31Thank you, this is all my contact
- 42:33information and this is the paper
- 42:36that this talk is really about.
- 42:38I'm happy to answer questions.
- 42:41Thanks so much, Beth.
- 42:43I think that's a really nice talk, and
- 42:45thank you for being so inspiring.
- 42:47And maybe let's open to questions
- 42:49first to see if we have any
- 42:52questions from the audience.
- 42:55If so, please speak up or, you know,
- 42:58send a chat. Either one is OK.
- 43:07And if not, I can go first.
- 43:09'cause I do have a couple of questions.
- 43:13So, so first of all, I think you
- 43:15know there is a constant tension.
- 43:17Of course, like you know when we work with
- 43:20really large trials in the healthcare system,
- 43:23I think there is a tension between how do we
- 43:26better represent the population of interest?
- 43:29Because we want to get effectiveness
- 43:31information 'cause we're
- 43:32spending millions of dollars.
- 43:33But also I think there is a concern
- 43:36on you know how to really better
- 43:38engage these large clusters,
- 43:40large healthcare systems or large clinics,
- 43:42etc. And so I think.
- 43:44People end up getting convenience
- 43:46samples because that's reality.
- 43:47Even though I do believe that there's
- 43:49so much more to improve because
- 43:51they're spending so much money right.
- 43:53And then in the end,
- 43:55you know they may be answering a
- 43:57different question if they have a
- 43:59very highly selected sample and then
- 44:01people also worry about you know,
- 44:03like you know there are some
- 44:05disparities in their sample selection,
- 44:06so that you're basically not covering
- 44:08you know people with maybe more
- 44:10vulnerable conditions etc. in your study,
- 44:12but you wish to answer questions
- 44:14about this population.
- 44:15So I feel like all of this is very,
- 44:18very relevant, at least to my work.
- 44:21And so I really appreciate, you know, this
- 44:24aspect of how to design trials better.
- 44:27One of the questions I have
- 44:29is that generally,
- 44:30you know, we may not really know a priori
- 44:33what the effect modifiers are in planning
- 44:36the trial; we may
- 44:38not have enough knowledge about them.
- 44:40So how does that generally come into
- 44:43the discussion in the design stage?
- 44:45Is it the tradition that in educational
- 44:48studies we have a lot of prior knowledge
- 44:51on what these effect modifiers are
- 44:53or? No, so I think this is actually
- 44:55one of the hardest parts, right?
- 44:57Like I just laid out,
- 44:58sort of, if we knew what the
- 45:00Y0s and Y1s were,
- 45:01this is what we would,
- 45:02you know, this is what would be optimal,
- 45:04but I could be wrong on
- 45:06what those are, right?
- 45:09And I don't know.
- 45:10I mean, I think so.
- 45:12There's sort of what I call the
- 45:14usual suspects in education,
- 45:15which are like race, class and gender,
- 45:18which are really more concerns
- 45:19about disparity or about closing
- 45:21achievement gaps in various ways.
- 45:23And so those, and
- 45:24urbanicity I would add, seem to be
- 45:27something that people often
- 45:28add into that as characteristics.
- 45:30Those are the ones that
- 45:32people most often use.
- 45:33And those are
- 45:35available in population data,
- 45:36which is the other thing:
- 45:38a real
- 45:39limiter is what is available
- 45:41in the population. Sure.
- 45:43What I gather is more likely to be a moderator
- 45:46is something like baseline achievement,
- 45:49right?
- 45:49So if my outcome is achievement then
- 45:52I would think that what the
- 45:55achievement is at baseline in any
- 45:56of these places would matter.
- 45:58That's harder to get in education,
- 46:00I mean, that information from places,
- 46:02so there's been some work trying to
- 46:05equate tests across states.
- 46:06I guess that they do.
- 46:08Sometimes
- 46:08they use gain scores just to subtract off
- 46:11that baseline achievement, right? They
- 46:13do. Yeah, exactly,
- 46:14but the problem is that like if you
- 46:16wanted to use state tests or something,
- 46:19there are different tests in every state,
- 46:21and so there's all of these
- 46:23equating issues that go in with it.
- 46:25My guess is that implementation is another
- 46:27one that people often come up with,
- 46:29like something with implementation.
- 46:31Now this is tricky because implementation
- 46:33is coming after assignment, and
- 46:34so it's really like a mediator.
- 46:36But often,
- 46:38if you think implementation
- 46:39may be part of what is leading
- 46:41to treatment effect variation,
- 46:42then you can kind of think, well, what
- 46:45affects implementation? And so
- 46:46people can sometimes think a little
- 46:48more carefully about what affects
- 46:50implementation, like, oh well,
- 46:51it's probably,
- 46:52you know, it's probably easier to implement
- 46:54this in schools that are like this
- 46:57than schools that are not like that.
- 47:01You might try to find various
- 47:03measures of this. For the
- 47:04implementation, that sounds more like
- 47:08sort of a version of multiple
- 47:10treatments, and it's a violation of
- 47:11the SUTVA condition, probably.
- 47:12Yeah, yeah, exactly yeah.
- 47:13So I mean, it gets,
- 47:15it gets tenuous. Yeah, I don't,
- 47:17I don't have, this is, you know,
- 47:19when I first started
- 47:21doing this work I was like,
- 47:23well, assume moderators and assume a
- 47:25population, moving on, as a statistician.
- 47:26But actually those are the two
- 47:28hardest things when working with
- 47:29people in planning these trials:
- 47:31thinking about what they are.
- 47:32I'll give you an example though.
- 47:34like a positive case, which was:
- 47:37I was part of designing something called
- 47:39the National Study of Learning Mindsets,
- 47:41in which we randomly sampled
- 47:43100 high schools in the US,
- 47:45and then the students,
- 47:48there were
- 47:49ninth graders in the study,
- 47:51and so 9th graders were randomly
- 47:53assigned to either a
- 47:55computer-based growth
- 47:57mindset intervention or something
- 47:59that was not growth mindset, that was
- 48:02just sort of a control condition.
- 48:05And in doing that, the social
- 48:07psychologists I was working with had
- 48:09a lot of questions, like we had a
- 48:11lot of hard questions about these
- 48:13moderators, and they had a lot of
- 48:15theories about what they might be.
- 48:17So we oversampled. We
- 48:19looked at, for example,
- 48:22the proportion of students that are
- 48:25minorities in the school, and then
- 48:28when we started we wanted to
- 48:30stratify on that as well as
- 48:32on a measure of school
- 48:34achievement,
- 48:35and so we needed to be able to
- 48:38cross these in a way in order to
- 48:41de-alias these trends, so
- 48:43that they could estimate one,
- 48:46you know,
- 48:46separated
- 48:48from the other. So a lot of it.
- 48:50So I mean,
- 48:51in some places people are
- 48:53much better theoretically at
- 48:54thinking about this.
- 48:55I think some fields are better at
- 48:57thinking about these mechanisms than
- 48:59other fields are, but yeah, it's really hard.
- 49:02It's really hard.
- 49:02So other than my
- 49:04standard set, you know,
- 49:06race, class and gender,
- 49:08I often ask people to think about
- 49:11what variables might just be
- 49:13related to other things, right?
- 49:14If you can,
- 49:15if you can think of it as, like, I
- 49:17ultimately want to test moderators
- 49:19that I don't really know exactly
- 49:21what they are,
- 49:22but I need to get variation in them,
- 49:24and that means probably by getting
- 49:26variation in something else.
- 49:27I'm going to get variation in those as well,
- 49:29so.
- 49:31The size of your site, you know,
- 49:33I think,
- 49:34is one place where, you know, in education,
- 49:37you can see that everybody's in
- 49:39very large sites.
- 49:40And so what if we increase
- 49:43the variation of district size
- 49:44and school size?
- 49:45It seems like that has to increase
- 49:47variation of some other things as
- 49:49well. Agreed, agreed.
- 49:50Yeah, I think another reason why I so
- 49:53appreciate the aspect of effect
- 49:55modifiers is that it really is a way to
- 49:58move forward with information from
- 50:00covariates. When we talk
- 50:02about the ATE in a randomized study,
- 50:05we often ignore covariates and
- 50:06then just hold that the unadjusted
- 50:09analysis provides unbiased estimates,
- 50:11even though that may come with
- 50:13a larger variation.
- 50:14So by really talking about effect modifiers,
- 50:17we somehow incorporate that information,
- 50:19perhaps even in the estimation
- 50:21of the average effect,
- 50:22which can increase precision.
- 50:24So yeah.
- 50:27Yeah.
- 50:30Yeah, I have a question,
- 50:32so I actually have two questions.
- 50:35It seems you're interested
- 50:37in both individual-level
- 50:39and cluster-level moderators, right? When
- 50:41you have cluster-level moderators,
- 50:44how does that work with
- 50:46the augmentation design?
- 50:47'Cause you mentioned that
- 50:49in the augmentation design,
- 50:51you might want to pick like
- 50:5410 or 30% of the sites
- 50:56and kind of like choose
- 50:58the samples from those.
- 50:59But how do you choose those 30%?
- 51:02Do you choose those 30% with
- 51:04respect to the cluster-level moderators?
- 51:06You could do it with respect to either.
- 51:09You can do it with respect to either
- 51:12because it depends on the way you
- 51:14enter them into the model.
- 51:18You can work out that
- 51:20if I'm interested in the individual-level
- 51:23moderator, what I need
- 51:25to do is I actually
- 51:27need to include as a covariate the
- 51:30interaction between, like, X and 1 - X.
- 51:32That's what I included here
- 51:34as the covariates,
- 51:35'cause I want to increase the
- 51:37variation within sites, right?
- 51:38And so you could do it either way,
- 51:41because
- 51:42what the augmentation approach does
- 51:44is it assesses how much variation
- 51:46you have in those 30 sites already,
- 51:49and then it looks for possible design runs,
- 51:52meaning other samples,
- 51:54other places that would greatly improve that.
- 51:58And it does it algorithmically,
- 51:59which is nice.
- 52:00I should add, an
- 52:02extra benefit of this: a concern
- 52:03with all of this sample recruitment
- 52:05is that there's nonresponse.
- 52:07You're never going to get,
- 52:08you know, it's not like I can just say, like,
- 52:11here's your,
- 52:11here's your 40 sites, go ask
- 52:13them, and they're going to say yes.
- 52:15But with the augmentation approach, if
- 52:16somebody says no, you can throw
- 52:18that out and then go look for
- 52:20what's the next best alternative,
- 52:21so you can keep kind of iterating.
- 52:24So,
- 52:25so in our current application,
- 52:26I think the attributes are all cluster-level
- 52:30information, right? Summary statistics?
- 52:31Yeah, yeah, well that's what
- 52:33I have right here.
- 52:34That's in the slides. What
- 52:36I didn't include in here, though,
- 52:38but it's
- 52:40in the paper, is you could also do
- 52:43this with individual-level moderators, only
- 52:44for proportions, just because
- 52:46the proportion works out so that
- 52:48you can think about this with
- 52:50the same statistics you would
- 52:52get at the cluster level.
- 52:53You can't get the variation
- 52:55with a normal, like a continuous, variable.
- 52:57I can't get the,
- 52:59I don't have the standard deviation
- 53:01within sites, I can't do that.
- 53:03Right, right.
- 53:05Also, the other question
- 53:06is about so it seems
- 53:08like all these designs are
- 53:09under the assumption that you're
- 53:11interested in all the moderators,
- 53:13equally, meaning that
- 53:15you don't have primary
- 53:17moderators that you're interested in
- 53:19estimating the moderator effect on and
- 53:20then you have a couple of them that.
- 53:23I mean, you
- 53:25can. So I mean, what's great,
- 53:27I think, about this area
- 53:29is that it's been so richly developed
- 53:32in this other sort of design-run way
- 53:34that you can actually add weights.
- 53:36So you can say, like, I'm
- 53:39more interested in this variable than
- 53:41that variable, and it will focus,
- 53:43you know, it will focus on one
- 53:45variable over the other.
- 53:47Because you can imagine, like,
- 53:49that S, that D matrix,
- 53:50the determinant of S,
- 53:52You could just add weights into that.
- 53:54So if you add weights into that
- 53:56then you can start looking at the
- 53:58determinant of that weighted version.
- 54:00Right, so you would add weights
- 54:02in that matrix and optimize that.
- 54:03Yeah exactly,
- 54:04if you add weights so that some
- 54:06of the covariates are getting
- 54:07more weight than others.
- 54:10So I guess just maybe more precisely,
- 54:13I think the D-optimality criterion,
- 54:16shouldn't that be the X
- 54:18transpose V inverse X in general?
- 54:20Just because you're working with
- 54:22cluster randomized studies, so
- 54:23that the outcome correlation is
- 54:25somehow included in that variance?
- 54:27Is that what the algorithm is
- 54:29trying to get in general?
- 54:32Yeah yeah.
- 54:34Yeah, it's the X prime X inverse,
- 54:37which is the covariance. Yeah,
- 54:38but really, you don't,
- 54:41you don't need to have the
- 54:42variance matrix of the outcome.
- 54:46Exactly, you don't need to have the outcome,
- 54:48it's all about the inputs, right?
- 54:50Which is why you
- 54:51can do it in advance, right?
- 54:53So it's all about the inputs. Just nicely,
- 54:56you can leverage population
- 54:57data that you have. Totally.
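
For reference, the two quantities contrasted in this exchange can be written as follows. The notation is an assumption for illustration: $X$ is the matrix of covariates for the selected sites and $V$ is a covariance matrix for the outcomes.

```latex
\text{inputs only: } \max_{\text{sample}} \det\left(X^{\top} X\right)
\qquad\qquad
\text{with outcome correlation: } \max_{\text{sample}} \det\left(X^{\top} V^{-1} X\right)
```

The first expression depends only on covariates, which is why it can be evaluated from population data before any outcomes are collected. The second would also require an assumed outcome covariance structure.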
- 54:58And again, I assume in all of this
- 55:01that there's measurement error
- 55:02and that, you know, you can just
- 55:04sort of assume that you're
- 55:06not going to get it exactly right,
- 55:08but my baseline comparison is always
- 55:10what are we doing now versus what could
- 55:12we be doing, and like, frankly, anything,
- 55:15you know, it looks to me like we
- 55:17have fairly homogeneous samples and
- 55:19that any effort we can make to increase
- 55:22that heterogeneity is an improvement.
- 55:28So, well, I think we're at the hour,
- 55:31but let's see if we have any
- 55:33final questions from the audience.
- 55:40Alrighty, if not, I think, you know,
- 55:43I'm sure if you have any questions
- 55:45that Beth will be happy to
- 55:47answer them offline by email.
- 55:48So thanks so much again, Beth.
- 55:50It's really nice to have you, and thanks to
- 55:53everybody for attending. We'll see all of you,
- 55:55hopefully, after the break, so have
- 55:58a great holiday. See you later.
- 56:02Totally not master connect
- 56:04alright, thanks again.
- 56:05Talk to you later. Bye take care.