1 00:00:08,174 --> 00:00:12,445 Good evening, and welcome to tonight's lecture on Sample Size and Power. 2 00:00:12,445 --> 00:00:13,847 My name is Dr. 3 00:00:13,847 --> 00:00:14,914 Laura Lee Johnson. 4 00:00:14,914 --> 00:00:19,185 I'm an associate director at the United States Food and Drug Administration. 5 00:00:19,185 --> 00:00:22,389 So, the disclaimer, of course, nothing that I say 6 00:00:22,389 --> 00:00:26,826 the FDA wants to be construed as representing their views or policies. 7 00:00:26,826 --> 00:00:30,563 So, why do we care about sample size and power? 8 00:00:31,097 --> 00:00:32,899 Think about this a little bit. 9 00:00:32,899 --> 00:00:36,836 Now last week, you all heard a couple of lectures from Paul Wakim. 10 00:00:36,836 --> 00:00:41,641 This is actually a slide that he showed in another set of lectures that we gave. 11 00:00:41,641 --> 00:00:46,780 And I think it's a very nice way to kind of set the mood for this lecture. 12 00:00:46,780 --> 00:00:51,885 If you forget by the time we go through all the formulas in the next 90 minutes, 13 00:00:52,485 --> 00:00:56,423 power is basically the probability of getting a statistically significant result 14 00:00:56,423 --> 00:00:59,259 when, in fact, there's a clinically meaningful difference. 15 00:00:59,259 --> 00:01:02,462 So, this is kind of unknown to us, right? 16 00:01:02,462 --> 00:01:08,535 And part of what we're going to be discussing in that -- over the next several weeks 17 00:01:08,535 --> 00:01:11,738 is, how do you know that something's clinically meaningful? 18 00:01:11,738 --> 00:01:17,077 But this in essence, this idea of power, we want to have high power, right? 19 00:01:17,077 --> 00:01:22,582 We want to have a high probability of seeing a statistically significant result 20 00:01:22,582 --> 00:01:27,253 in those hypotheses tests from last week if, in fact, there 21 00:01:27,253 --> 00:01:29,589 is this clinically meaningful difference. 
22 00:01:29,589 --> 00:01:34,461 So, by definition, those studies that have low power are less 23 00:01:34,461 --> 00:01:36,996 likely to produce statistically significant results, 24 00:01:36,996 --> 00:01:39,966 even when that clinically meaningful effect exists. 25 00:01:39,966 --> 00:01:44,604 And that's the reason your animal care committees and the human 26 00:01:44,604 --> 00:01:50,110 subjects committees are going to say, "Well, is it ethical to do research 27 00:01:51,144 --> 00:01:52,412 if you don't 28 00:01:52,412 --> 00:01:57,117 have that much power?" So, this becomes not just a, "I 29 00:01:57,117 --> 00:02:01,821 want to have significant findings question." It's also an ethical question. 30 00:02:01,821 --> 00:02:09,062 Is it worthwhile to put people at risk or their data or their privacy even at risk 31 00:02:09,062 --> 00:02:15,468 if you're not going to be able to tell that, as one of my colleagues 32 00:02:15,468 --> 00:02:21,441 said, "There's a there there?" Now, the flip of this in how power 33 00:02:21,441 --> 00:02:26,579 and sample size and statistical significance tied together is that the lack 34 00:02:26,579 --> 00:02:31,251 of statistical significance doesn't prove that there is no treatment effect, 35 00:02:31,251 --> 00:02:37,657 because it could be a consequence of small sample size or really its low power. 36 00:02:39,492 --> 00:02:43,830 So, it's important to have enough power and an adequate sample size. 37 00:02:43,830 --> 00:02:45,031 So, our objectives 38 00:02:45,031 --> 00:02:49,903 for this lecture, to calculate changes in sample size based on the changes in 39 00:02:49,903 --> 00:02:54,407 the difference of interest, and the variance, or the number of study arms. 40 00:02:54,407 --> 00:02:56,843 So, we'll talk about each of those. 41 00:02:56,843 --> 00:02:57,544 We'll understand 42 00:02:57,544 --> 00:03:01,681 the intuition behind power calculations, hopefully, by the end of tonight's lecture. 
43 00:03:01,681 --> 00:03:05,151 Be able to recognize some of the common sample size 44 00:03:05,151 --> 00:03:08,621 formulas for the tests that were talked about last week. 45 00:03:09,556 --> 00:03:13,193 And learn some tips for getting through the IRB. 46 00:03:13,193 --> 00:03:18,331 Your general takeaway message is you need to get input from a statistician. 47 00:03:18,331 --> 00:03:22,936 It's not just because I am one, but because the ramifications of 48 00:03:22,936 --> 00:03:27,507 not continually talking to your statisticians and epidemiologists can be pretty big. 49 00:03:27,507 --> 00:03:32,078 You can waste a lot of money and a lot of time. 50 00:03:32,078 --> 00:03:35,148 All calculations, after I say calculations are important, 51 00:03:35,648 --> 00:03:40,253 you kind of have to take them with a few grains of salt. 52 00:03:40,253 --> 00:03:45,592 Most of the basic formulas that you will find are, like, the lowest level possible. 53 00:03:45,592 --> 00:03:50,563 But in reality, we have these kind of what we call in the U.S. 54 00:03:50,563 --> 00:03:54,100 "fudge factors." So, realistically, people are going to drop out. 55 00:03:54,100 --> 00:03:59,405 Maybe the variance of that endpoint is going to be larger than you were anticipating. 56 00:03:59,405 --> 00:04:02,875 Maybe accrual will be slower than you were anticipating. 57 00:04:02,875 --> 00:04:05,178 A lot of things can happen. 58 00:04:05,178 --> 00:04:09,415 And so, we have to take that into account when we're 59 00:04:09,415 --> 00:04:13,653 trying to figure out what our actual sample size should be. 60 00:04:13,653 --> 00:04:19,792 The other trick is, you will always, always, always round up on sample size, never down. 61 00:04:19,792 --> 00:04:20,927 Why is that? 62 00:04:20,927 --> 00:04:25,565 Because if your calculation tells you that you need 10.01 human beings 63 00:04:25,565 --> 00:04:30,169 in your study, there is no one one-hundredth of a human being. 
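The rounding rule and the "fudge factor" idea can be sketched in a few lines of Python. This is only an illustration: the function name `participants_needed` and the simple divide-by-retention adjustment for dropout are assumptions made here, not a method given in the lecture, and a real study would justify its inflation factor in the protocol.

```python
import math


def participants_needed(calculated_n: float, dropout_rate: float = 0.0) -> int:
    """Inflate a calculated sample size for expected dropout, then round UP.

    dropout_rate is a hypothetical planning input (e.g. 0.20 for 20 percent).
    """
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    inflated = calculated_n / (1.0 - dropout_rate)
    # Always round up: there is no fraction of a participant.
    return math.ceil(inflated)


print(participants_needed(10.01))        # 11
print(participants_needed(10.01, 0.20))  # 13
```

Rounding down instead would leave the study slightly short of the planned power.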
64 00:04:30,303 --> 00:04:35,808 You need 11 people in your study, or your power goes down. 65 00:04:35,808 --> 00:04:42,215 We'll talk a little bit more about that as we progress through the lecture. 66 00:04:42,215 --> 00:04:46,586 But also, always remember that analysis follows the design. 67 00:04:46,586 --> 00:04:49,555 So, a few other take-home messages. 68 00:04:49,555 --> 00:04:54,594 You really need to give thought and have a broad discussion 69 00:04:54,594 --> 00:04:59,198 about what difference is scientifically important and in what units. 70 00:04:59,198 --> 00:05:02,201 Are we talking inches or centimeters? Why? 71 00:05:02,201 --> 00:05:07,507 It can actually -- you know, the sample size calculations are not invariant, 72 00:05:07,507 --> 00:05:12,812 as a mathematical term, so the units matter, and also the measurement error. 73 00:05:12,812 --> 00:05:18,551 Remember I mentioned at one point, do I use, like, a little plastic ruler? 74 00:05:18,551 --> 00:05:23,056 Or what other method could I use to measure my waist? 75 00:05:23,056 --> 00:05:27,560 Think about how you're going to be measuring, because that information 76 00:05:27,560 --> 00:05:30,830 has to go into your sample size calculation. 77 00:05:31,564 --> 00:05:35,535 Are you taking a single measure of systolic blood pressure, 78 00:05:35,535 --> 00:05:37,904 or are you averaging three together? 79 00:05:37,904 --> 00:05:43,009 This information is going to matter when you're calculating power and sample size. 80 00:05:43,009 --> 00:05:45,011 How variable are the measurements? 81 00:05:45,011 --> 00:05:49,349 Again, my favorite plastic ruler does not measure my waist well. 82 00:05:51,818 --> 00:05:56,155 So, in sample size, you need to think of a few elements. 83 00:05:56,155 --> 00:06:00,460 One is, what is the difference or the effect to be detected? 84 00:06:00,460 --> 00:06:04,397 This might be a ratio. It might be an actual difference. 
85 00:06:04,397 --> 00:06:06,566 This element needs to be known. 86 00:06:06,566 --> 00:06:10,169 You need to think about the variance in the outcome. 87 00:06:10,169 --> 00:06:13,406 You need to understand what significance level you want. 88 00:06:13,740 --> 00:06:18,678 So, is it a one-tailed or a two-tailed test is also an important part 89 00:06:18,678 --> 00:06:22,215 of all of these equations. What power do you want? 90 00:06:22,215 --> 00:06:26,085 What level of Type II error are you willing to allow? 91 00:06:28,121 --> 00:06:30,990 Do you want equal randomization to different 92 00:06:30,990 --> 00:06:35,094 study arms or unequal randomization to all the study arms? 93 00:06:35,094 --> 00:06:39,599 Are you thinking about doing a superiority, equivalence, or non-inferiority trial? 94 00:06:39,599 --> 00:06:42,869 All of these have different sample size calculations 95 00:06:42,869 --> 00:06:47,974 or different numbers that you need to put in for the calculations. 96 00:06:47,974 --> 00:06:52,278 Other elements that come up will be a follow-up period. 97 00:06:52,278 --> 00:06:54,747 How long a participant is followed? 98 00:06:54,747 --> 00:06:58,618 One of our examples will talk a little bit about that. 99 00:06:58,618 --> 00:07:01,154 We'll also talk a little bit about censoring. 100 00:07:01,154 --> 00:07:06,859 This is when a participant has been in the trial for a while, you have all their data, 101 00:07:06,859 --> 00:07:09,729 but now that participant is no longer being followed. 102 00:07:09,729 --> 00:07:12,932 So, you have what's called incomplete follow-up which is common. 103 00:07:12,932 --> 00:07:15,134 And sometimes we also do administrative censoring. 104 00:07:15,134 --> 00:07:19,605 So, let's say I'm trying to follow people and find out when they die. 105 00:07:20,139 --> 00:07:23,209 Well, it's the end of my study, and they're still alive. 
106 00:07:23,209 --> 00:07:26,813 So, at the end of study I may administratively censor them saying, "I 107 00:07:26,813 --> 00:07:29,315 knew they were alive up until this time point." 108 00:07:29,315 --> 00:07:32,218 We'll talk more about this in the survival analysis lecture. 109 00:07:32,218 --> 00:07:35,121 So, we're going to start off tonight talking about power, 110 00:07:35,121 --> 00:07:38,458 go through some basic sample size information, a whole bunch of examples. 111 00:07:38,458 --> 00:07:41,494 But there are going to be more examples in the textbook. 112 00:07:42,628 --> 00:07:43,629 I'm going to 113 00:07:43,629 --> 00:07:48,434 talk about a few of the changes you can make to the basic formula. 114 00:07:48,434 --> 00:07:51,504 That's also covered more in depth in the textbook. 115 00:07:51,504 --> 00:07:54,941 We also will do a little bit of multiple comparisons. 116 00:07:54,941 --> 00:07:59,745 You've already talked about that some, and other people will talk about it more. 117 00:07:59,745 --> 00:08:03,850 So, I'll try to race through that to get to what St. 118 00:08:03,850 --> 00:08:07,587 George's College has put out in their information to researchers as, 119 00:08:07,587 --> 00:08:12,725 "These are really poor sample size statements, please never send them to our IRB again." 120 00:08:15,661 --> 00:08:18,331 So, again, power depends on your sample size. 121 00:08:18,331 --> 00:08:22,368 A lot of people ask me why we don't have power formulas. 122 00:08:22,368 --> 00:08:25,071 Well, they are there, but they're algebraically intensive. 123 00:08:25,071 --> 00:08:28,074 It's actually easier to teach the sample size formula. 124 00:08:28,074 --> 00:08:32,778 You can work backwards to figure out a power formula if you need it. 125 00:08:32,778 --> 00:08:35,815 But either way, you know, the math will work. 
126 00:08:36,849 --> 00:08:41,220 But again, this idea about power is it's the probability 127 00:08:41,220 --> 00:08:46,025 of rejecting the null hypothesis if the alternative hypothesis is true. 128 00:08:46,025 --> 00:08:51,063 Typically, the more subjects you have, the higher power you have. 129 00:08:51,063 --> 00:08:53,466 So, what's power affected by? 130 00:08:53,466 --> 00:08:56,102 I already mentioned the sample size. 131 00:08:56,102 --> 00:08:59,171 But variation in that outcome is huge. 132 00:08:59,171 --> 00:09:05,711 If I can lower my variance, I increase my power, all other things being equal. 133 00:09:07,380 --> 00:09:13,019 If I increase the significance level, so if I say, "Oh, instead of a Type 134 00:09:13,019 --> 00:09:17,890 I error of 0.05, I could have 0.1," my power will go up. 135 00:09:17,890 --> 00:09:23,529 And if I'm interested in different effects, it's always easier to detect a bigger effect. 136 00:09:23,529 --> 00:09:28,801 If I increase the difference I'm willing to detect, my power will go up. 137 00:09:28,801 --> 00:09:31,437 But again, it has to be reasonable. 138 00:09:31,771 --> 00:09:36,642 If you tell me that you're going to lower someone's average systolic blood pressure 139 00:09:36,642 --> 00:09:42,949 by 50 points, by 50 units, I'm going to say you are crazy, and that can't be done. 140 00:09:42,949 --> 00:09:47,320 So, you have to be careful that your difference is actually reasonable. 141 00:09:47,320 --> 00:09:48,921 One-tailed versus two-tailed tests. 142 00:09:48,921 --> 00:09:50,323 For the same alpha, 143 00:09:50,323 --> 00:09:54,860 your power is greater in a one-tailed test than a comparable two-tailed test. 144 00:09:55,227 --> 00:10:00,299 Because remember, in a two-tailed test, I have to split it onto two different sides. 145 00:10:00,299 --> 00:10:02,335 So, here are a few examples. 146 00:10:02,335 --> 00:10:05,071 Let's say that I have a two-arm study. 
147 00:10:05,071 --> 00:10:07,773 I have a total sample size of 32. 148 00:10:07,773 --> 00:10:10,843 So, I have 16 people in each study arm. 149 00:10:10,843 --> 00:10:13,212 So, this is a small little study. 150 00:10:13,212 --> 00:10:18,985 Some people may think it's like an R21 or an R03 if you get NIH's extramural funding. 151 00:10:20,786 --> 00:10:22,622 81 percent power, my delta 152 00:10:22,622 --> 00:10:27,460 is equal to 2, so the difference I want to detect is 2. 153 00:10:27,460 --> 00:10:30,062 I set my standard deviation at 2. 154 00:10:30,062 --> 00:10:33,032 The Type I error, I set at 0.05. 155 00:10:33,032 --> 00:10:36,002 And I want to do a two-sided test. 156 00:10:36,002 --> 00:10:41,407 So, at this sample size with all this information, I have 81 percent power. 157 00:10:41,407 --> 00:10:44,910 Let's say I was wrong on the standard deviation. 158 00:10:45,211 --> 00:10:47,613 Instead it was 1, not 2. 159 00:10:47,613 --> 00:10:52,385 The good news is my power has gone up to 99.9 percent. 160 00:10:52,385 --> 00:10:55,955 Problem is normally we are in this second situation 161 00:10:55,955 --> 00:11:01,894 where we thought it was going to be 2, and it turns out it was 3. 162 00:11:01,894 --> 00:11:06,666 2 and 3 sound so close together, but not in power land. 163 00:11:06,666 --> 00:11:09,835 Now, my power has dropped to 47 percent. 164 00:11:09,835 --> 00:11:13,005 We'll talk about why in a few minutes. 165 00:11:13,673 --> 00:11:14,974 The significance level. 166 00:11:14,974 --> 00:11:18,844 I was going to plan my study for 0.05. 167 00:11:18,844 --> 00:11:23,582 Now, they're saying, "No, you have all these things going on. 168 00:11:23,582 --> 00:11:28,320 We think you should test at 0.01." Your power, instead of 169 00:11:28,320 --> 00:11:32,658 being 81 percent, is 69 percent when you do that. 
170 00:11:32,658 --> 00:11:37,396 You cannot change your alpha level without changing your sample size 171 00:11:37,396 --> 00:11:40,433 if you want to maintain your power. 172 00:11:40,433 --> 00:11:46,906 But let's say instead you are going to power all of these interactions at 0.05, 173 00:11:46,906 --> 00:11:52,511 then somebody said, "No." You can test these interactions at 0.1, not 0.05. 174 00:11:52,511 --> 00:11:57,683 Well, your power has now gone up, 81 percent to 94 percent. 175 00:11:58,284 --> 00:11:59,118 That's pretty cool. 176 00:12:02,054 --> 00:12:05,958 Let's say the difference to be detected, you wanted to detect 177 00:12:05,958 --> 00:12:09,829 a difference of two hours of sleep with your new intervention. 178 00:12:09,829 --> 00:12:14,066 You have this behavioral intervention that you think people will sleep longer. 179 00:12:14,066 --> 00:12:16,569 You cut down one hour of sleep. 180 00:12:16,569 --> 00:12:20,439 Your power to detect one hour of sleep is 29 percent. 181 00:12:20,439 --> 00:12:22,208 Not such a smart thing. 182 00:12:22,208 --> 00:12:27,880 You kind of want to figure out, do you care about two hours or one hour? 183 00:12:27,880 --> 00:12:31,751 Let's say instead of two hours, you go for three hours. 184 00:12:32,151 --> 00:12:38,624 These people are getting no sleep, and you just want to see, does it show anything? 185 00:12:38,624 --> 00:12:42,261 Well, your power has gone up to 99 percent. 186 00:12:42,261 --> 00:12:47,933 But does your intervention really cause people to sleep three more hours a night? 187 00:12:47,933 --> 00:12:48,968 Probably not. 188 00:12:48,968 --> 00:12:52,404 Let's say like I admit I did once. 189 00:12:52,404 --> 00:12:58,878 You thought the sample size of 32 was per arm and not total for your study. 190 00:12:58,878 --> 00:13:02,414 So, you actually had 64 people in the study. 191 00:13:02,414 --> 00:13:05,951 The good news is your power is 98 percent. 
192 00:13:05,951 --> 00:13:12,625 The bad news is you just spent a lot of extra money you didn't need to spend. 193 00:13:12,625 --> 00:13:17,329 The problem, however, is we always find ourselves in this other situation. 194 00:13:17,329 --> 00:13:20,065 Sample size is supposed to be 32. 195 00:13:20,065 --> 00:13:22,034 We're running out of money. 196 00:13:22,034 --> 00:13:27,907 We're having trouble recruiting, and we ended up with only 28 people in the study. 197 00:13:29,441 --> 00:13:33,212 So, our power that was 81 percent -- and typically, 198 00:13:33,212 --> 00:13:38,117 the reason I chose 81 percent is because 80 percent is usually considered 199 00:13:38,117 --> 00:13:43,422 kind of a minimum for a study, minimum level of power, although that varies. 200 00:13:43,422 --> 00:13:46,058 We'll talk a little bit about that. 201 00:13:46,058 --> 00:13:50,229 But what happens here is my power is now 75 percent. 202 00:13:50,229 --> 00:13:55,134 So, I don't have an 80 percent chance to see statistical significance 203 00:13:55,134 --> 00:13:57,770 when I have something that's clinically meaningful. 204 00:13:57,770 --> 00:14:00,472 I only have a 75 percent chance. 205 00:14:00,472 --> 00:14:03,008 Maybe that's okay, but maybe not. 206 00:14:03,008 --> 00:14:06,278 Then we have these two-tailed versus one-tailed tests. 207 00:14:06,278 --> 00:14:13,152 If I switch to a one-tailed test, I boost up a little bit to 88 percent power. 208 00:14:13,152 --> 00:14:17,523 So, that gets us to what should your power actually be. 209 00:14:17,523 --> 00:14:21,794 So, again, that Phase III minimum tends to be 80 percent. 210 00:14:21,794 --> 00:14:27,967 There are other people that will say, "Your Type I error should equal your Type II 211 00:14:27,967 --> 00:14:34,340 error." A lot of these large, huge definitive trials might have 99.9 percent power. 
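The two-arm scenarios walked through above (16 per arm, a difference of 2, a standard deviation of 2, a two-sided test at 0.05, and the variations on each) can be explored with a rough calculation. The sketch below uses a normal approximation for comparing two means; that choice, and the function name `approx_power`, are assumptions made for illustration rather than the method behind the lecture's numbers. The percentages it prints are approximate and need not match every figure quoted, which presumably came from slides or software using a different method.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def approx_power(n_per_arm, delta, sd, alpha=0.05, two_sided=True):
    """Approximate power for comparing two means (normal approximation).

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    tail_alpha = alpha / 2 if two_sided else alpha
    z_crit = _STD_NORMAL.inv_cdf(1 - tail_alpha)
    standard_error = sd * sqrt(2 / n_per_arm)  # SE of the difference in means
    return _STD_NORMAL.cdf(abs(delta) / standard_error - z_crit)


scenarios = {
    "baseline: 16 per arm, delta 2, SD 2": dict(n_per_arm=16, delta=2, sd=2),
    "SD turns out to be 3": dict(n_per_arm=16, delta=2, sd=3),
    "difference of 1 instead of 2": dict(n_per_arm=16, delta=1, sd=2),
    "only 28 recruited (14 per arm)": dict(n_per_arm=14, delta=2, sd=2),
    "one-tailed test": dict(n_per_arm=16, delta=2, sd=2, two_sided=False),
}

for label, kwargs in scenarios.items():
    print(f"{label}: {approx_power(**kwargs):.0%}")
```

Changing one input at a time, as the loop does, shows how strongly power responds to the variance and to the difference of interest.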
212 00:14:34,340 --> 00:14:39,778 I have other folks that say, "Well, I'm just doing a proof of concept, 213 00:14:39,778 --> 00:14:46,785 and they have 60 to 80 percent power." In some of the omics studies that I looked at, 214 00:14:46,785 --> 00:14:52,591 they have really high power, because Type II error is such a problem for them. 215 00:14:56,795 --> 00:14:59,431 So, what is your power formula? 216 00:14:59,431 --> 00:15:02,501 It all depends on your study design. 217 00:15:02,501 --> 00:15:04,236 It's just algebraically intensive, 218 00:15:04,236 --> 00:15:09,942 and you should probably use a computer program to figure out your power. 219 00:15:09,942 --> 00:15:14,747 Or make a statistician run a bunch of simulations for you 220 00:15:14,747 --> 00:15:18,017 to figure out your range of power. 221 00:15:18,017 --> 00:15:23,956 So, this is kind of our basic sample size 101, where everything starts. 222 00:15:23,956 --> 00:15:26,558 Flipping from power to sample size, 223 00:15:27,960 --> 00:15:29,862 same elements are in play. 224 00:15:29,862 --> 00:15:34,366 Changes in the difference of interest are going to have huge impacts 225 00:15:34,366 --> 00:15:38,704 on your sample size, and so are changes in the variance. 226 00:15:38,704 --> 00:15:40,005 Some examples here. 227 00:15:40,005 --> 00:15:45,277 If you have a 20-point difference that you designed for, maybe you need 25 228 00:15:45,277 --> 00:15:46,378 patients per group. 229 00:15:46,378 --> 00:15:51,283 If I change nothing else but this point difference, if I drop it 230 00:15:51,283 --> 00:15:55,421 by half, to 10, I now need 100 patients per group. 231 00:15:55,421 --> 00:16:01,627 If it drops again to a 5-point difference, I now need 400 patients per group. 
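The pattern in this example (25, then 100, then 400 patients per group) follows from the sample size being proportional to one over the squared difference when nothing else changes. A minimal sketch, with the hypothetical helper name `rescale_n` chosen here for illustration:

```python
import math


def rescale_n(n_ref, delta_ref, delta_new):
    """Rescale a per-group sample size when only the target difference changes.

    Sample size is proportional to 1 / delta**2, so halving the difference
    quadruples the number of patients needed per group.
    """
    return math.ceil(n_ref * (delta_ref / delta_new) ** 2)


print(rescale_n(25, 20, 10))  # 100
print(rescale_n(25, 20, 5))   # 400
```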
232 00:16:01,627 --> 00:16:07,132 All the types of things that feed into just your basic study 233 00:16:07,132 --> 00:16:11,704 design, a basic randomized two-arm study: changes in the difference 234 00:16:11,704 --> 00:16:18,544 to be detected, your alpha or that Type I error, your beta, or one minus 235 00:16:18,544 --> 00:16:24,049 beta, your power, the standard deviation, the number of samples you're taking. 236 00:16:24,049 --> 00:16:27,720 Is it a cluster randomized or non-cluster randomized study? 237 00:16:28,454 --> 00:16:31,523 Is it a one-sided or two-sided test? 238 00:16:31,523 --> 00:16:36,962 All of these can have huge impacts on your sample size calculation. 239 00:16:36,962 --> 00:16:42,434 So, your basic formula for the total sample of a two-arm study 240 00:16:42,434 --> 00:16:47,239 that's just a simple, basic individually randomized study, nothing special, is that 241 00:16:47,239 --> 00:16:52,044 2N, where N is the sample size for a single arm, 242 00:16:52,044 --> 00:16:56,849 would be 4 times -- this is a critical value, Z. 243 00:16:57,649 --> 00:16:59,385 These are on that bell curve. 244 00:16:59,385 --> 00:17:01,954 Your computer will tell you what this value is. 245 00:17:01,954 --> 00:17:06,592 Or if you have a really old stats textbook, there's a picture of the bell curve 246 00:17:06,592 --> 00:17:09,762 in the back that tells you. 1 minus alpha over 2. 247 00:17:09,762 --> 00:17:10,896 Why is this alpha? 248 00:17:10,896 --> 00:17:14,366 This is your Type I error that you are willing to allow. 249 00:17:14,366 --> 00:17:16,668 Divided by 2 if it's a two-sided test. 250 00:17:17,803 --> 00:17:18,670 So, for 251 00:17:18,670 --> 00:17:22,975 a 0.05 test, that's two-sided, this value is like 1.96. 252 00:17:22,975 --> 00:17:25,110 Z of 1 minus beta. 253 00:17:25,110 --> 00:17:28,981 This 1 minus beta is that power, all right? 254 00:17:28,981 --> 00:17:33,285 So, if it's 80 percent power, I think that's about 0.84. 
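The critical values described here can be read off the standard normal distribution instead of a printed table. A small sketch using Python's standard library; the helper name `z_quantile` is an assumption made for illustration:

```python
from statistics import NormalDist


def z_quantile(p):
    """Return the standard normal quantile z_p, i.e. P(Z <= z_p) = p."""
    return NormalDist().inv_cdf(p)


alpha = 0.05
print(round(z_quantile(1 - alpha / 2), 3))  # two-sided 0.05 test: about 1.96
print(round(z_quantile(0.80), 3))           # 80 percent power: about 0.842
print(round(z_quantile(0.90), 3))           # 90 percent power: about 1.282
print(round(z_quantile(0.95), 3))           # about 1.645
```

For reference, 1.645 is the 0.95 quantile, so it goes with a one-sided 0.05 test or with 95 percent power, while 80 percent power corresponds to roughly 0.84.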
255 00:17:33,285 --> 00:17:38,891 Again, it's in your textbook, we actually wrote some of the numbers down. 256 00:17:38,891 --> 00:17:43,595 You take these two Z critical values, add them, and square the sum. 257 00:17:43,595 --> 00:17:46,198 You multiply it by the variance, 258 00:17:47,699 --> 00:17:50,869 divided by the difference you're interested in, squared. 259 00:17:50,869 --> 00:17:55,874 Now, all the different formulas for sample size for all the different study 260 00:17:55,874 --> 00:17:59,244 designs and regressions, et cetera, they all look different. 261 00:17:59,244 --> 00:18:03,916 But these basic elements will be in every single one of them. 262 00:18:03,916 --> 00:18:08,954 So, what do you need to think about before taking all this information 263 00:18:08,954 --> 00:18:10,055 to your statistician? 264 00:18:10,055 --> 00:18:13,792 What do you even need to talk to them about 265 00:18:14,226 --> 00:18:17,996 in addition to the basic background for your projects? 266 00:18:17,996 --> 00:18:22,134 First off, do you have a randomized or non-randomized study? 267 00:18:22,134 --> 00:18:25,304 Non-randomized studies require a lot larger sample size, 268 00:18:25,304 --> 00:18:30,442 because remember, we've got all those potential confounders we've got to look at. 269 00:18:30,442 --> 00:18:32,010 Similarly, the effect modifiers. 270 00:18:32,010 --> 00:18:36,381 We have to account for that in the sample size planning. 271 00:18:36,381 --> 00:18:41,120 Occasionally in surveys -- so, Barbara Stussman may talk a little bit 272 00:18:41,120 --> 00:18:46,859 more about this in a few weeks -- absolute sample size may be of interest. 273 00:18:46,859 --> 00:18:51,730 So, if you're doing a survey, sometimes we take a percent of population approach. 274 00:18:51,730 --> 00:18:56,268 We may say you have a given threshold: the confidence interval around 275 00:18:56,268 --> 00:18:59,404 your percent needs to be above a certain threshold. 
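Written out, the basic two-arm formula described a few moments earlier looks like the following. The symbols are chosen here for illustration: N is the sample size per arm, sigma squared is the outcome variance, and delta is the difference of interest; the worked numbers reuse the lecture's running example (difference 2, standard deviation 2, two-sided alpha of 0.05) with 80 percent power.

```latex
% Total sample size for a simple two-arm, individually randomized study
2N = \frac{4\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}\,\sigma^{2}}{\delta^{2}}

% Worked example: alpha = 0.05 (two-sided), power = 0.80, sigma = 2, delta = 2
2N \approx \frac{4\,(1.96 + 0.84)^{2}\,(2)^{2}}{(2)^{2}}
   = 4 \times 7.84 \approx 31.4
   \quad\Longrightarrow\quad 32 \text{ in total, } 16 \text{ per arm after rounding up}
```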
276 00:18:59,404 --> 00:19:02,007 So, you're trying to plan around that. 277 00:19:02,007 --> 00:19:06,712 So, you've got to keep in mind what is the study's primary outcome. 278 00:19:07,379 --> 00:19:10,816 That's your basis for your sample size calculation. 279 00:19:10,816 --> 00:19:15,988 That said, secondary or other outcome variables, if you think they're important, 280 00:19:15,988 --> 00:19:22,427 make sure you have sufficient sample size to be able to answer those key questions. 281 00:19:22,427 --> 00:19:27,166 Increase the real sample size for your study to reflect things 282 00:19:27,166 --> 00:19:32,771 like loss to follow-up, expected response rates, lack of compliance, et cetera. 283 00:19:32,771 --> 00:19:37,509 You can't do this, you know, just off the cuff though. 284 00:19:37,509 --> 00:19:42,347 You have to write into your human subjects applications and your grant applications 285 00:19:42,347 --> 00:19:45,717 how you are justifying those kinds of plus-ups, right? 286 00:19:48,086 --> 00:19:50,155 And again, always round up. 287 00:19:50,155 --> 00:19:54,927 So, let's say that you're going to do your sample size calculation, 288 00:19:54,927 --> 00:19:58,330 and you're looking through books and you're looking online, 289 00:19:58,330 --> 00:20:02,134 what you're going to probably find are sample size calculations 290 00:20:02,134 --> 00:20:06,705 for two groups, a continuous outcome, and look at the mean difference. 291 00:20:06,705 --> 00:20:11,276 We're going to talk about that mainly tonight, because similar ideas 292 00:20:11,276 --> 00:20:14,680 hold for different types of outcomes and study designs. 293 00:20:14,680 --> 00:20:20,385 But do understand a lot of those formulas will not be applicable to your research, 294 00:20:20,652 --> 00:20:22,654 because you're answering a different question. 
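For the common case mentioned here (two groups, a continuous outcome, a difference in means), the calculation can be sketched as below. It uses the normal approximation and equal allocation; the function name `two_arm_total_n` and its defaults are assumptions for illustration, and, as the lecture stresses, it will not apply to designs that answer a different question.

```python
import math
from statistics import NormalDist


def two_arm_total_n(delta, sd, alpha=0.05, power=0.80, two_sided=True):
    """Total sample size for comparing two means with equal-sized arms.

    Normal approximation only; no allowance for dropout, clustering,
    repeated measures, or unequal randomization.
    """
    std_normal = NormalDist()
    tail_alpha = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail_alpha)
    z_beta = std_normal.inv_cdf(power)
    n_per_arm = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    # Round each arm UP, then double, so both arms stay equal in size.
    return 2 * math.ceil(n_per_arm)


print(two_arm_total_n(delta=2, sd=2))  # 32
print(two_arm_total_n(delta=1, sd=2))  # 126
```

Any increase for loss to follow-up or non-compliance would be applied on top of this figure and justified in the application.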
295 00:20:25,490 --> 00:20:29,294 So, to your lovely person helping you with your sample size 296 00:20:29,294 --> 00:20:33,832 calculation, you need to tell them what are all the variables of interest 297 00:20:33,832 --> 00:20:35,567 that you want to measure. 298 00:20:35,567 --> 00:20:39,371 What's the type of data? Is it continuous? Is it categorical? 299 00:20:39,371 --> 00:20:42,841 Is it ordinal? Bring them some examples, if you can. 300 00:20:42,841 --> 00:20:44,576 What is the desired power? 301 00:20:44,576 --> 00:20:47,379 Where are you in this continuum of research? 302 00:20:47,379 --> 00:20:51,883 And what is the power that's going to be expected in your field? 303 00:20:52,651 --> 00:20:54,853 What is the desired significance level? 304 00:20:54,853 --> 00:20:56,521 Along those same lines. 305 00:20:56,521 --> 00:20:59,691 What's the effect or difference that's clinically important? 306 00:20:59,691 --> 00:21:02,294 Very rarely is this a single number. 307 00:21:02,294 --> 00:21:07,466 A lot of times, what you're looking for is you're going to say, "Okay. 308 00:21:07,466 --> 00:21:11,169 What seems to be feasible given the research that's done? 309 00:21:11,169 --> 00:21:13,405 What seems to be in the consensus reports?" 310 00:21:13,405 --> 00:21:18,210 Maybe you have animal data, you're trying to extrapolate it to human beings. 311 00:21:18,877 --> 00:21:21,046 Maybe you have nothing at all. 312 00:21:21,046 --> 00:21:24,683 And we'll talk about those. But what seems to matter? 313 00:21:24,683 --> 00:21:26,118 This is the reason 314 00:21:26,118 --> 00:21:31,523 I don't always like the effect size elements where they just look at the ratio 315 00:21:31,523 --> 00:21:32,624 of the variance 316 00:21:32,624 --> 00:21:37,296 and the difference, because -- or rather, the standard deviation and the difference. 
317 00:21:37,296 --> 00:21:43,101 The problem when you use that for your effect size is you haven't figured out like, 318 00:21:43,101 --> 00:21:44,970 what's the actual variance? 319 00:21:44,970 --> 00:21:49,608 And does that effect size actually translate to a difference that's relevant? 320 00:21:49,608 --> 00:21:53,512 So, think about those standard deviations for your continuous outcomes. 321 00:21:53,512 --> 00:21:59,318 But if at all possible, try to have real data on both of those bullets. 322 00:21:59,318 --> 00:22:03,188 And decide if you're doing a one or two-sided test. 323 00:22:03,188 --> 00:22:04,356 Realistically, you're going 324 00:22:04,356 --> 00:22:09,594 to have to have a really good justification to do a one-sided test. 325 00:22:09,594 --> 00:22:11,730 What is your data structure? 326 00:22:12,497 --> 00:22:14,166 Are we talking paired data? 327 00:22:14,166 --> 00:22:17,803 Are you doing pre-post on folks? Are you taking repeated measures? 328 00:22:17,803 --> 00:22:22,774 Are you going to take this blood pressure measurement 16 times over a two-year period? 329 00:22:22,774 --> 00:22:25,410 Are you looking at groups of equal sizes? 330 00:22:25,410 --> 00:22:28,413 Or are you going to do a 2-to-1 randomization? 331 00:22:28,413 --> 00:22:34,052 Or maybe it's a case control study, and you're going to get five controls for every case. 332 00:22:35,320 --> 00:22:38,190 Do you have hierarchical or nested data? 333 00:22:38,190 --> 00:22:43,929 Is that inside individually randomized folks, or are you doing a cluster randomized trial 334 00:22:43,929 --> 00:22:48,834 where you're going to have people inside of schools, inside of regions? 335 00:22:48,834 --> 00:22:50,869 Are you looking at biomarkers? 336 00:22:50,869 --> 00:22:55,374 Or are you trying to actually do and develop a biomarker? 
337 00:22:55,374 --> 00:23:00,679 Because there's a lot of stuff that can be involved in the laboratory 338 00:23:00,679 --> 00:23:05,350 assay part, and very different equations, if you're in fact trying to validate 339 00:23:05,350 --> 00:23:09,154 a biomarker versus using it as essentially the endpoint in your trial. 340 00:23:11,223 --> 00:23:15,927 And how much validation information do you have about any of these endpoints? 341 00:23:15,927 --> 00:23:19,531 How do you know how they perform in your population? 342 00:23:22,501 --> 00:23:25,604 Your study design is going to really matter. 343 00:23:25,604 --> 00:23:29,074 I've talked a little bit about the randomization aspects. 344 00:23:29,074 --> 00:23:34,513 You need to talk to them about how you're going to randomize your trial. 345 00:23:34,513 --> 00:23:38,383 But again, there are all different formulations, whether it's equivalence. 346 00:23:38,383 --> 00:23:42,254 So, again, how then do you define the equivalence bound? 347 00:23:42,254 --> 00:23:45,757 If it's non-inferiority, what's that non-inferiority bound versus superiority? 348 00:23:45,757 --> 00:23:48,460 If you have a non-randomized intervention study, 349 00:23:48,460 --> 00:23:53,899 what other variables are you measuring that will need to go into the model? 350 00:23:53,899 --> 00:23:56,234 So, prevalence or observational studies. 351 00:23:56,234 --> 00:24:01,206 Are you trying to figure out positive and negative predictive values? 352 00:24:01,206 --> 00:24:02,574 Sensitivity and specificity? 353 00:24:02,574 --> 00:24:05,277 All of these have different calculations.