1 00:00:09,309 --> 00:00:12,178 Statistics, you guys have heard a lot about. 2 00:00:12,178 --> 00:00:16,449 And so, I'm going to focus less on the methods in stats 3 00:00:16,449 --> 00:00:21,788 and a little bit more on kind of the practical implications of those statistical choices. 4 00:00:21,788 --> 00:00:25,358 And so, again, these are all detailed in your protocol 5 00:00:25,358 --> 00:00:32,465 so that you don't get to the end of the study and go, "What did we want to look at? 6 00:00:33,366 --> 00:00:35,535 What was it that we were going to analyze? 7 00:00:35,535 --> 00:00:38,671 We had this great idea." And then you go back to your application 8 00:00:38,671 --> 00:00:41,074 and you're like, "Yeah, we described that in two paragraphs 9 00:00:41,074 --> 00:00:45,178 because that was all the space we had." So, this is where you have the opportunity. 10 00:00:45,178 --> 00:00:47,580 Lay it all out, your analytic plan can be done, 11 00:00:47,580 --> 00:00:50,950 so that as soon as the data is clean, your biostatistician can run it 12 00:00:50,950 --> 00:00:53,119 because you already know what you're going to do. 13 00:00:53,353 --> 00:00:58,224 So, it saves you a lot of time in the end if you have this 14 00:00:58,224 --> 00:01:00,160 all spelled out ahead of time. 15 00:01:00,160 --> 00:01:03,730 And more importantly, what we're seeing in the open science trends 16 00:01:03,730 --> 00:01:08,568 and for rigor and reproducibility, is that people are having to put their a priori 17 00:01:08,568 --> 00:01:11,504 hypothesis and in some cases, analytic plans out there 18 00:01:11,504 --> 00:01:14,074 and -- before they even start enrolling participants. 19 00:01:14,074 --> 00:01:18,511 And so, journals are now comparing back to, "Well, what did you say 20 00:01:18,511 --> 00:01:23,650 your primary outcome was going to be and is that actually what you're reporting in 21 00:01:23,650 --> 00:01:28,621 your primary paper here?" 
So, it's important to get those things specified and detailed. 22 00:01:28,621 --> 00:01:31,157 I mentioned this before, I think, when 23 00:01:31,157 --> 00:01:35,261 I was talking about blood pressure and how you might collect that. 24 00:01:35,762 --> 00:01:40,934 So, this is an area of your protocol where you define really clearly 25 00:01:40,934 --> 00:01:42,902 how you're defining your outcomes. 26 00:01:42,902 --> 00:01:47,674 So, for cancer studies, you might have an appointed panel of experts 27 00:01:47,674 --> 00:01:52,812 that actually review the pathology slides to determine whether or not a person 28 00:01:52,812 --> 00:01:54,013 actually has cancer 29 00:01:54,013 --> 00:01:59,152 or meets the definition of nonalcoholic steatohepatitis for a liver study. 30 00:01:59,152 --> 00:02:03,923 You may actually have a group of individuals who review those blinded 31 00:02:03,923 --> 00:02:09,062 and make a determination as to whether they think that disease is present 32 00:02:09,062 --> 00:02:11,498 or not in the sample that they receive. 33 00:02:11,498 --> 00:02:13,166 They might see an imaging scan 34 00:02:13,166 --> 00:02:17,871 and make a determination as to whether or not -- you know, what is the size of the tumor? 35 00:02:17,871 --> 00:02:19,239 Has it shrunk or not? 36 00:02:19,239 --> 00:02:22,709 And so, they actually give the measurements and those kinds of things. 37 00:02:22,709 --> 00:02:27,247 So, you want to define how -- for your study, you're going to define your outcome. 38 00:02:27,714 --> 00:02:30,049 And I really do encourage people 39 00:02:30,049 --> 00:02:34,721 to use the standard definitions in the field, the agreed upon either 40 00:02:34,721 --> 00:02:39,392 diagnostic criteria or methods that have been published by a given field. 
41 00:02:39,392 --> 00:02:45,265 And so, one way to find those is to look for other large clinical trials 42 00:02:45,265 --> 00:02:51,104 that have been done in a particular area and see what definitions they have used. 43 00:02:51,104 --> 00:02:55,775 The clinicaltrials.gov resource that lists all of the trials that are ongoing 44 00:02:55,775 --> 00:03:01,247 is a great resource to be able to look at inclusion and exclusion criteria 45 00:03:01,247 --> 00:03:07,487 for other studies, so that you can see that you're enrolling the same kinds of patients 46 00:03:07,487 --> 00:03:09,055 as other trials are. 47 00:03:09,055 --> 00:03:14,494 And also, the definitions of their outcomes so that you can use similar outcome 48 00:03:14,494 --> 00:03:19,332 definitions in your trial and that you're not creating something de novo 49 00:03:19,332 --> 00:03:24,537 and totally new that you've come up with that would be difficult to include 50 00:03:24,537 --> 00:03:27,507 in a meta-analysis later because you didn't use 51 00:03:27,507 --> 00:03:32,145 an agreed upon definition that the rest of the field is using. 52 00:03:32,145 --> 00:03:37,517 So, for example, the studies that I have listed here are the Women's Health 53 00:03:37,517 --> 00:03:41,588 Initiative, the Systolic Hypertension in the Elderly Program which has definitions 54 00:03:41,588 --> 00:03:44,924 for how do you define hypertension in older adults, 55 00:03:45,592 --> 00:03:48,628 the Studies of Left Ventricular Dysfunction, the SOLVD study. 56 00:03:48,628 --> 00:03:53,066 They have very clear definitions of how they've identified what is an 57 00:03:53,066 --> 00:03:57,103 MI, what is a myocardial infarction, what is a heart attack. 
58 00:03:57,103 --> 00:04:02,308 And so, those are really good resources to look at those very large trials 59 00:04:02,308 --> 00:04:07,113 that have spent a lot of time determining how they're going to make 60 00:04:07,113 --> 00:04:11,718 those determinations as to what the definitions are of the primary outcomes. 61 00:04:11,718 --> 00:04:15,989 One of the things that can be a challenge is specifying 62 00:04:15,989 --> 00:04:19,659 not only what data are you going to collect, but in what units are you going to collect it. 63 00:04:19,659 --> 00:04:24,697 I used to show a slide and I don't think it's in this slide deck 64 00:04:24,697 --> 00:04:26,733 anymore, where people had collected height 65 00:04:26,733 --> 00:04:30,770 and weight in different measures, and some of them were in -- 66 00:04:30,770 --> 00:04:34,140 weight was in kilograms and sometimes it was in pounds, 67 00:04:34,140 --> 00:04:37,844 that you had different staff writing it down in different ways. 68 00:04:37,844 --> 00:04:42,248 And if you tried to merge that data, you get really skewed information 69 00:04:42,248 --> 00:04:45,618 and not really know how to interpret the summary data. 70 00:04:46,052 --> 00:04:50,490 So, case report forms are what we typically collect the data on, 71 00:04:50,490 --> 00:04:53,059 either electronically -- in electronic case report 72 00:04:53,059 --> 00:04:57,096 forms, or in paper copies which we're seeing less of now. 73 00:04:57,096 --> 00:04:58,197 It's really important 74 00:04:58,197 --> 00:05:02,969 that they include the units as to how they're supposed to be measured. 75 00:05:02,969 --> 00:05:08,107 And if you're collecting it electronically, you can actually create limits in your system 76 00:05:08,107 --> 00:05:14,380 so that it won't allow you to put in, you know, an adult weight at ten pounds. 77 00:05:14,380 --> 00:05:16,582 That it just wouldn't allow you. 78 00:05:16,582 --> 00:05:19,519 You would have ranges as to what's acceptable. 
79 00:05:21,421 --> 00:05:24,791 So, those are our elements that are very important, 80 00:05:24,791 --> 00:05:26,292 how you're measuring it. 81 00:05:26,292 --> 00:05:30,396 We do occasionally get investigators that want to rely on self-report 82 00:05:30,396 --> 00:05:34,534 for height and weight, which, if that's not a primary outcome, 83 00:05:34,534 --> 00:05:39,372 and it's just kind of demographic information that you're collecting, it's fine. 84 00:05:39,372 --> 00:05:44,243 But if you're planning to use BMI and calculate BMI from that information 85 00:05:44,243 --> 00:05:49,482 and use it as a confounding variable to adjust for in your analysis, 86 00:05:49,482 --> 00:05:53,786 you probably need to get accurate, measured height and weight. 87 00:05:53,786 --> 00:05:59,192 And certainly, if you're ever going to share that data for secondary data analysis, 88 00:05:59,192 --> 00:06:04,797 it would be important to label it as to whether it's measured height and weight, 89 00:06:04,797 --> 00:06:09,869 or whether it's self-reported height and weight so that the data couldn't be 90 00:06:09,869 --> 00:06:14,674 inappropriately used thinking that it was actually measured as opposed to self-report. 91 00:06:14,674 --> 00:06:15,875 And, you know, 92 00:06:15,875 --> 00:06:21,080 the measurement piece of this is -- the reproducibility thing comes up again. 93 00:06:21,080 --> 00:06:23,082 Yeah? Is there a question? 94 00:06:23,082 --> 00:06:25,885 Okay. Making sure that you are collecting 95 00:06:25,885 --> 00:06:30,690 the measure in the way that other people do it as well. 96 00:06:30,690 --> 00:06:34,293 So, I gave you the example of blood pressure. 97 00:06:34,827 --> 00:06:37,330 So, for most studies it's done sitting. 98 00:06:37,330 --> 00:06:43,035 It's done after a period of rest when the person is not talking, is not eating. 
99 00:06:43,035 --> 00:06:48,040 They get quite frustrated because they don't like to sit for -- typically, about 100 00:06:48,040 --> 00:06:53,045 five minutes and people's attention span, that is forever for people at this time. 101 00:06:53,045 --> 00:06:54,647 But that's the typical. 102 00:06:54,647 --> 00:06:59,452 If you need to do something for your study in a different way, 103 00:06:59,452 --> 00:07:04,457 if you actually want to get a reading on blood pressure when someone's exercising, 104 00:07:04,457 --> 00:07:07,326 it's really important that you label your variables 105 00:07:07,326 --> 00:07:13,766 that way and make it very clear as to well, how much exercise do they need to do 106 00:07:13,766 --> 00:07:18,471 and at what heart rate are you trying to get them to, and, 107 00:07:18,471 --> 00:07:23,576 you know, what blood pressure would you try to get for that? 108 00:07:23,576 --> 00:07:28,247 So, detailing all these aspects in the protocol is really important 109 00:07:28,247 --> 00:07:33,553 so your staff can collect the data that you want it collected. 110 00:07:33,553 --> 00:07:38,291 So, you've likely heard some issues around study instrument surveys and 111 00:07:38,558 --> 00:07:44,797 how to think about using them, the importance of making sure that they're actually reliable 112 00:07:44,797 --> 00:07:51,170 and valid the way that you're using them, and that they're actually sensitive to change. 113 00:07:51,270 --> 00:07:53,673 So, there are some measures, particularly 114 00:07:53,673 --> 00:07:58,511 in the psychology literature which we talk about as state or trait. 115 00:07:58,511 --> 00:08:02,114 So, state is the current state that you're in 116 00:08:02,114 --> 00:08:06,953 and you would expect that to fluctuate from day-to-day, time to time. 117 00:08:06,953 --> 00:08:12,191 How anxious are you giving a talk versus how anxious are you relaxing 118 00:08:12,191 --> 00:08:13,793 and getting a massage? 
119 00:08:13,793 --> 00:08:16,662 Likely very different, depending on who you are. 120 00:08:16,662 --> 00:08:21,634 One may make you much more anxious than the other depending on who you are. 121 00:08:21,634 --> 00:08:25,638 Whereas trait variables are thought to be quite static and not change. 122 00:08:25,638 --> 00:08:29,175 If you choose an outcome measure and you're not very familiar 123 00:08:29,175 --> 00:08:33,679 with the different instruments and you happen to choose something that is a trait 124 00:08:33,679 --> 00:08:37,850 variable that's not going to change, then it's not a very good outcome 125 00:08:37,850 --> 00:08:42,355 measure to see if your intervention is going to have an influence on that 126 00:08:43,256 --> 00:08:47,527 if you really do think that trait variables and trait characteristics of individuals 127 00:08:47,527 --> 00:08:49,161 are something that are static. 128 00:08:49,161 --> 00:08:52,131 There certainly is a literature that's pushing to think 129 00:08:52,131 --> 00:08:55,067 that there are some interventions that can shift people's 130 00:08:55,067 --> 00:08:57,370 trait tendencies in terms of their characteristics. 131 00:08:57,370 --> 00:08:59,005 But for the most part, 132 00:08:59,005 --> 00:09:00,339 people think about state 133 00:09:00,339 --> 00:09:00,973 as something 134 00:09:00,973 --> 00:09:05,912 that you would want to potentially measure as an outcome variable because it is variable 135 00:09:05,912 --> 00:09:10,516 versus a trait that is something that is just a characteristic about the person. 136 00:09:12,318 --> 00:09:13,352 So, those things 137 00:09:13,352 --> 00:09:17,523 are really important to think about as you're selecting your outcome measures. 138 00:09:17,523 --> 00:09:22,028 Over what time period do the things change, so some instruments are really 139 00:09:22,028 --> 00:09:26,165 only developed to be used every six months or once a year. 
140 00:09:26,165 --> 00:09:30,336 Some of the stress characteristics -- the Perceived Stress Scale is one. 141 00:09:30,336 --> 00:09:35,174 Some of the items on there are, "Have you moved in the last year?" 142 00:09:36,075 --> 00:09:39,645 Well, that's not going to change very likely in six weeks. 143 00:09:39,645 --> 00:09:44,850 So, that item is going to be pretty static for most people on that particular scale. 144 00:09:44,850 --> 00:09:46,485 Have you lost your job? 145 00:09:46,485 --> 00:09:49,088 Have you had a death in the family? 146 00:09:49,088 --> 00:09:52,024 Those things are not something that on a day-to-day 147 00:09:52,024 --> 00:09:54,627 basis shift very much for a given individual. 148 00:09:54,627 --> 00:09:59,198 And so, a more daily stress measure might be something better if you're looking 149 00:09:59,198 --> 00:10:02,435 for changes in whether someone's stress levels changed over time. 150 00:10:06,872 --> 00:10:07,707 The other 151 00:10:07,707 --> 00:10:12,378 thing that we're seeing more and more of with outcome measures, 152 00:10:12,378 --> 00:10:17,883 most of the particularly survey instruments were developed as pen and paper instruments. 153 00:10:17,883 --> 00:10:23,389 And we see them more and more being shifted to being collected online, 154 00:10:23,389 --> 00:10:26,792 or being collected through EMA, Ecological Momentary Assessment. 155 00:10:26,792 --> 00:10:34,000 So, they ping you with a text and ask you a question or two in real time 156 00:10:34,667 --> 00:10:39,572 wherever you are to find out currently where you're at, how do you feel. 157 00:10:39,572 --> 00:10:45,478 And if the instrument wasn't validated to be used that way, and to be assessed that way, 158 00:10:45,478 --> 00:10:49,548 then it really needs to be validated to be used that way 159 00:10:49,548 --> 00:10:52,251 because there could be confusion around the language. 
160 00:10:52,251 --> 00:10:56,322 There could be a lot of problems with how reliable it is 161 00:10:56,322 --> 00:11:00,693 and whether you get the same response every time you ask the question. 162 00:11:01,093 --> 00:11:06,666 And so, it's not that it can't be done, it's that someone -- before you use it 163 00:11:06,666 --> 00:11:08,300 as your primary outcome measure, 164 00:11:08,300 --> 00:11:11,937 make sure that you're choosing an instrument that has been validated 165 00:11:11,937 --> 00:11:16,509 and is found to be reliable in the way that you're actually using it. 166 00:11:18,511 --> 00:11:19,412 And this comes 167 00:11:19,412 --> 00:11:23,616 down to kind of is it a good measure for you, for your study? 168 00:11:23,616 --> 00:11:28,387 And as this says at the end here, there may not be published data on it, 169 00:11:28,387 --> 00:11:32,291 but that might be an opportunity for you to work with a psychometrician 170 00:11:32,291 --> 00:11:35,861 to do the validity study on delivering it in a different way. 171 00:11:36,429 --> 00:11:41,267 We need more valid and reliable measures through a variety of different 172 00:11:41,267 --> 00:11:43,669 delivery methods because patients and participants 173 00:11:43,669 --> 00:11:48,908 in research studies get very frustrated filling out hours and hours of forms. 174 00:11:48,908 --> 00:11:52,945 And you've likely heard some about the computer adaptive testing 175 00:11:52,945 --> 00:11:56,982 that is available through some of these instruments now where 176 00:11:56,982 --> 00:12:01,387 an individual takes an online -- does a set of questions 177 00:12:01,387 --> 00:12:07,026 and it calibrates to the individual, the way the GRE is done. 178 00:12:07,293 --> 00:12:12,031 So, it hones you into your score based on what your responses 179 00:12:12,031 --> 00:12:16,335 are rather than asking everyone the exact same set of questions. 
180 00:12:16,335 --> 00:12:20,673 It's really customized to the individual who's responding to the questions. 181 00:12:20,673 --> 00:12:25,377 It can dramatically shorten the number of items that an individual has 182 00:12:25,377 --> 00:12:30,483 to answer to respond that way, but it's very foreign to an investigator. 183 00:12:30,483 --> 00:12:37,323 And I think they're getting more comfortable with this, but to know that not everyone answers 184 00:12:37,323 --> 00:12:42,061 the same questions is something that's very different than a 185 00:12:42,061 --> 00:12:46,665 typical standard set of a 27-item questionnaire for example. 186 00:12:46,665 --> 00:12:54,073 So, as I mentioned, these are the elements that you certainly want to include in your 187 00:12:54,540 --> 00:12:59,478 protocol in terms of the general design issues, your treatment assignment. 188 00:12:59,478 --> 00:13:00,913 The randomization procedure 189 00:13:00,913 --> 00:13:05,785 is something that's really important and we're starting to see, particularly 190 00:13:06,085 --> 00:13:11,724 at my center, more and more interventions that are delivered as group interventions. 191 00:13:11,724 --> 00:13:16,162 And it's very important to specify whether you are individually 192 00:13:16,162 --> 00:13:19,298 taking an individual that decides to participate 193 00:13:19,298 --> 00:13:25,538 and there's a random chance as to them participating in Class A versus Class 194 00:13:25,538 --> 00:13:31,811 B; or whether you've created a random allocation of classes and depending on 195 00:13:31,811 --> 00:13:38,217 when that participant happens to call, they get the next class that's available. 196 00:13:38,217 --> 00:13:40,686 And as you can imagine, 197 00:13:40,686 --> 00:13:45,624 if that randomization allocation sequence becomes known to study staff 198 00:13:45,624 --> 00:13:52,531 or to the people enrolling participants, it may influence who they decide to enroll. 
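One way to picture a randomization sequence that is fixed in advance but not visible to the people enrolling participants is the sketch below. It is an illustration under stated assumptions, not the procedure of any study mentioned in the talk: permuted blocks are just one common scheme, and `permuted_block_sequence` and `ConcealedAllocator` are hypothetical names. In a real trial the concealment comes from a central system or an independent statistician, not from a Python class.

```python
import random


def permuted_block_sequence(n_blocks, arms=("A", "B"), per_arm=2, seed=None):
    """Build an allocation list from shuffled blocks.

    Each block contains every arm `per_arm` times, so the arms stay
    balanced after every completed block.
    """
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


class ConcealedAllocator:
    """Reveals one assignment at a time, only when a participant is enrolled."""

    def __init__(self, sequence):
        self._sequence = list(sequence)
        self._assigned = {}  # participant_id -> arm, in order of enrollment

    def assign(self, participant_id):
        # A participant who was already randomized keeps the same arm.
        if participant_id in self._assigned:
            return self._assigned[participant_id]
        if len(self._assigned) >= len(self._sequence):
            raise RuntimeError("allocation sequence exhausted")
        arm = self._sequence[len(self._assigned)]
        self._assigned[participant_id] = arm
        return arm
```

The design choice here is that staff only ever call `assign` for a participant who has already been confirmed eligible, so nobody can look ahead to see which arm comes next.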
199 00:13:52,531 --> 00:14:00,439 So, if Treatment A is a surgical procedure and Treatment B is a watch and wait, 200 00:14:00,439 --> 00:14:05,077 they may hold off -- I mean, this is an extreme example. 201 00:14:05,077 --> 00:14:08,147 This is not likely to actually happen in a trial. 202 00:14:08,147 --> 00:14:11,817 But there certainly is published literature that if the staff is aware, 203 00:14:11,817 --> 00:14:14,887 they may even unconsciously not refer people to a study 204 00:14:14,887 --> 00:14:19,158 if they know that the next treatment allocation is for a watch and wait 205 00:14:19,158 --> 00:14:21,627 if they think that patient really needs treatment. 206 00:14:21,627 --> 00:14:23,929 They may actually not even refer patients. 207 00:14:23,929 --> 00:14:28,067 Whereas if they know the next treatment allocation is to the actual intervention, 208 00:14:28,067 --> 00:14:32,338 they're likely to refer a lot more patients and potentially much more severe patients 209 00:14:32,338 --> 00:14:36,008 that they think would respond or do well with that particular treatment. 210 00:14:36,508 --> 00:14:40,779 So, the blinding of the delivery and the sequence of class 211 00:14:40,779 --> 00:14:43,916 delivery in particular is something that's really important 212 00:14:43,916 --> 00:14:49,788 if you're doing a random class allocation, which we're starting to see more and more. 213 00:14:49,788 --> 00:14:51,724 And the same is true 214 00:14:51,724 --> 00:14:57,796 with other types of cluster designs and -- that need to be taken into account. 215 00:14:57,796 --> 00:15:02,668 The other issue that comes up that I don't think gets enough 216 00:15:02,668 --> 00:15:07,740 attention is the fact that when you randomize the participants in a study 217 00:15:07,740 --> 00:15:11,277 is really important in terms of your study flow. 
218 00:15:12,945 --> 00:15:16,782 You know, one of the things I'm going to talk about 219 00:15:16,782 --> 00:15:22,021 and I'm sure one of the terms you've already heard is intention to treat analysis, 220 00:15:22,021 --> 00:15:24,123 which is everyone randomized gets analyzed. 221 00:15:24,123 --> 00:15:28,661 And because of that, you want to shorten the time and the wait 222 00:15:28,661 --> 00:15:33,232 period between when the randomization is done and when the participant actually starts 223 00:15:33,232 --> 00:15:34,233 the intervention. 224 00:15:34,233 --> 00:15:39,338 So, I've mentioned a few times that we often have these class delivered interventions. 225 00:15:39,338 --> 00:15:41,907 Mindfulness-based stress reduction is one of them, 226 00:15:41,907 --> 00:15:45,945 but this could be true for a group-based cognitive behavioral therapy, 227 00:15:45,945 --> 00:15:48,480 or other group-based visits for diabetes management. 228 00:15:48,480 --> 00:15:53,252 And if you have to wait until you get 15 or 20 participants 229 00:15:53,252 --> 00:15:58,023 before you can start the classes, if the first time someone calls you 230 00:15:58,023 --> 00:16:03,495 and you determine that they're eligible, you decide to randomize them and you say, "Okay. 231 00:16:03,495 --> 00:16:07,900 You've been randomized to this class and it happens on Tuesday night. 232 00:16:07,900 --> 00:16:13,038 And our first class will be on April 25th." And then you wait 233 00:16:13,038 --> 00:16:17,409 and you contact people, you're going to lose a lot of people. 234 00:16:17,676 --> 00:16:22,448 And so, you run the risk of having a lot of people drop out of your study 235 00:16:22,448 --> 00:16:26,685 after they've already been randomized and you now have to include them in your analysis. 236 00:16:26,685 --> 00:16:29,755 If instead you screen people and you let them know, "Okay. 237 00:16:29,755 --> 00:16:33,425 Our classes will be starting around the middle or the end of April. 
238 00:16:33,425 --> 00:16:36,795 We'll call you again at the middle or the end of April 239 00:16:36,795 --> 00:16:39,031 to make sure you still want to participate 240 00:16:39,031 --> 00:16:41,834 and confirm some of the characteristics, that nothing has changed, 241 00:16:41,834 --> 00:16:43,869 and you're still eligible for the study. 242 00:16:43,869 --> 00:16:47,039 The study is planned -- your first class is April 25th," 243 00:16:47,039 --> 00:16:51,944 and you let them know the class will either be on this night, or on this night. 244 00:16:51,944 --> 00:16:55,981 "Please make sure your schedule is clear, so that we -- you could participate 245 00:16:55,981 --> 00:17:00,019 in the class." You call them, ideally, the day before if you can. 246 00:17:00,019 --> 00:17:04,623 You wait, really, as long as possible to be able to confirm that they're still eligible 247 00:17:04,623 --> 00:17:08,660 and then tell them which class they're going to go to after you confirm 248 00:17:08,660 --> 00:17:12,131 they can still attend the class because so many things come up. 249 00:17:12,498 --> 00:17:15,367 "Oh, my parents -- you know, I have a sick child 250 00:17:15,367 --> 00:17:18,737 that I've got to do all of these visits with," or "Oh, no. 251 00:17:18,737 --> 00:17:23,675 You know, my neighbor fell and broke their hip and now I have to help with care for them," 252 00:17:23,675 --> 00:17:26,812 or "My grandchild needs aftercare and my daughter got a new job. 253 00:17:26,812 --> 00:17:32,418 And so, I have to pick her up instead." So many things happen in life between the time 254 00:17:32,418 --> 00:17:37,056 someone decides to participate and potentially when a class might start for the class. 
255 00:17:37,056 --> 00:17:39,024 And so, putting off that randomization 256 00:17:39,024 --> 00:17:43,328 until the last possible moment that's feasible for the study to pull off 257 00:17:43,328 --> 00:17:47,966 can save you tremendously in missing data in the long run for your analysis. 258 00:17:47,966 --> 00:17:52,237 So, that's one of the things that I certainly wanted to mention tonight. 259 00:17:55,574 --> 00:17:59,845 The other thing that I'll mention is that if you have interim analyses, 260 00:17:59,845 --> 00:18:03,482 you include those in your protocol as well, 261 00:18:03,482 --> 00:18:05,784 and hopefully you've had some information about, 262 00:18:05,784 --> 00:18:08,087 you know, really only doing interim analyses, 263 00:18:08,087 --> 00:18:11,690 and having stopping rules if it's absolutely necessary for your study, 264 00:18:11,690 --> 00:18:16,628 if there's a real reason to want to assess for futility in the study early. 265 00:18:17,096 --> 00:18:21,433 And certainly, your full analytic plan needs to be included as well. 266 00:18:21,433 --> 00:18:25,471 I think you guys have heard a lot about sample sizes 267 00:18:25,471 --> 00:18:29,641 and the importance of basing your sample size on your primary outcome 268 00:18:29,641 --> 00:18:32,778 and the primary hypothesis that you're going to test, 269 00:18:32,778 --> 00:18:36,615 and that you've got enough information on which to base that. 270 00:18:36,615 --> 00:18:39,418 And there are many different ways to analyze data. 271 00:18:39,718 --> 00:18:43,622 My take home message as an investigator and someone giving advice to investigators is 272 00:18:43,622 --> 00:18:46,125 have a really good biostatistician that you work with 273 00:18:46,125 --> 00:18:48,627 from the beginning all the way through your study. 
274 00:18:48,627 --> 00:18:53,365 And don't just hand them your data at the end because it's likely they won't be able 275 00:18:53,365 --> 00:18:57,002 to do the analysis that you hoped they would be able to do. 276 00:18:57,803 --> 00:19:02,207 You want to involve them from the very beginning to make sure you're 277 00:19:02,207 --> 00:19:04,910 designing your study to actually answer your hypothesis, 278 00:19:04,910 --> 00:19:09,314 and you're collecting the data that they need to look at the outcomes, 279 00:19:09,314 --> 00:19:13,685 and control for any potential confounders that may be measured in your study. 280 00:19:16,255 --> 00:19:20,159 Again, I'm not going to go into a lot of the details 281 00:19:20,159 --> 00:19:25,397 although I will emphasize the point on the analytic plan, that you do need to have 282 00:19:25,397 --> 00:19:30,335 a very clear process laid out as to how you're going to handle missing data, 283 00:19:30,335 --> 00:19:33,906 what you're going to do about it, how your statistician -- 284 00:19:33,906 --> 00:19:37,509 are they going to try to impute based on other data? 285 00:19:37,509 --> 00:19:42,447 What is the method that they're going to deal with the issue of missing data? 286 00:19:42,447 --> 00:19:47,019 And how are they going to evaluate whether it's missing at random, or whether 287 00:19:47,019 --> 00:19:48,487 it's actually informative missing-ness? 288 00:19:48,487 --> 00:19:50,122 And that 289 00:19:50,856 --> 00:19:53,825 you had differential drop out, so your wait list control 290 00:19:53,825 --> 00:19:58,597 that you thought would be great -- well, nobody came back for their follow up visits. 291 00:19:58,597 --> 00:20:03,035 And so, you have very little follow up data on your wait list comparison group. 
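The differential dropout described here is easy to check descriptively before any modeling. The sketch below is a hypothetical illustration: the record layout (`arm`, `followup_score`) and the function name are assumptions, and a gap between arms only flags a concern. It does not by itself show whether data are missing at random or informatively missing.

```python
from collections import defaultdict


def missing_rate_by_arm(records, outcome_key="followup_score"):
    """Return the fraction of participants with a missing outcome in each arm.

    A value of None (or an absent key) counts as missing.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        totals[rec["arm"]] += 1
        if rec.get(outcome_key) is None:
            missing[rec["arm"]] += 1
    return {arm: missing[arm] / totals[arm] for arm in totals}


# Made-up example records, for illustration only.
records = [
    {"arm": "intervention", "followup_score": 12},
    {"arm": "intervention", "followup_score": 15},
    {"arm": "intervention", "followup_score": None},
    {"arm": "intervention", "followup_score": 9},
    {"arm": "waitlist", "followup_score": None},
    {"arm": "waitlist", "followup_score": None},
    {"arm": "waitlist", "followup_score": 14},
    {"arm": "waitlist", "followup_score": None},
]
rates = missing_rate_by_arm(records)
```

With these made-up records, a quarter of the intervention arm and three quarters of the wait list arm are missing follow up data, which is the kind of imbalance worth raising with the statistician.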
292 00:20:03,035 --> 00:20:07,472 It's -- it can be a definite problem in studies that have very different types 293 00:20:07,472 --> 00:20:10,442 of interventions for participants and different engagement of the participants. 294 00:20:12,044 --> 00:20:15,781 I mentioned the interim analysis and the biggest thing here is 295 00:20:16,014 --> 00:20:21,687 you want to be able to analyze your data, so that it will have an impact. 296 00:20:21,687 --> 00:20:26,425 And so, all of these elements of including these pieces in the protocol 297 00:20:26,425 --> 00:20:31,663 ultimately get you to the place where you have good quality data that's been collected 298 00:20:31,663 --> 00:20:34,833 consistently across all participants that you've included, only participants 299 00:20:34,833 --> 00:20:39,371 who are eligible for your study, and that if everyone followed the protocol 300 00:20:39,371 --> 00:20:43,575 all the way through, you have very little to no missing data. 301 00:20:43,842 --> 00:20:48,313 So, you have a great quality data set on which to base your analytics 302 00:20:48,313 --> 00:20:50,215 and then conclusions of your study. 303 00:20:50,215 --> 00:20:51,483 The more missing data 304 00:20:51,483 --> 00:20:55,954 you have, the more difficult it is to interpret the results of your study 305 00:20:55,954 --> 00:20:58,490 and to make really conclusive statements 306 00:20:58,490 --> 00:21:01,660 about whether or not your intervention was beneficial or not. 307 00:21:04,196 --> 00:21:04,930 This, I'll 308 00:21:04,930 --> 00:21:09,034 mention briefly just because it's this concept of if they're randomized, 309 00:21:09,034 --> 00:21:14,273 it is someone that you need to include in your intent to treat analysis. 310 00:21:14,273 --> 00:21:16,541 I'll refer you to your biostatisticians 311 00:21:16,541 --> 00:21:21,413 about whether or not it's wise to do modified intent to treat analysis. 
312 00:21:21,413 --> 00:21:24,383 There's a lot of differing opinion about that. 313 00:21:24,383 --> 00:21:30,389 Certainly, we do see per protocol analysis to see -- and often times, these are explored 314 00:21:30,389 --> 00:21:35,227 as different types of sensitivity analysis to see whether there's a huge impact 315 00:21:35,227 --> 00:21:39,731 of whether or not people are compliant or adherent to their protocol 316 00:21:39,731 --> 00:21:43,101 in terms of their benefit with the treatment itself.
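The difference between an intent to treat analysis and a per protocol analysis comes down to who is in the analysis set. The sketch below is a simplified illustration, assuming every randomized participant has an outcome and using hypothetical field names (`assigned_arm`, `adherent`, `outcome`). A real analysis would also have to handle missing outcomes and would be specified by the study's biostatistician.

```python
from statistics import mean


def mean_outcome_by_arm(participants):
    """Average outcome per arm, grouping by the arm assigned at randomization."""
    by_arm = {}
    for p in participants:
        by_arm.setdefault(p["assigned_arm"], []).append(p["outcome"])
    return {arm: mean(values) for arm, values in by_arm.items()}


def itt_and_per_protocol(participants):
    """Return (intent to treat, per protocol) mean outcomes by arm.

    Intent to treat keeps everyone who was randomized; per protocol keeps
    only participants flagged as adherent to their assigned treatment.
    """
    itt = mean_outcome_by_arm(participants)
    adherent_only = [p for p in participants if p["adherent"]]
    per_protocol = mean_outcome_by_arm(adherent_only)
    return itt, per_protocol


# Made-up example data, for illustration only.
participants = [
    {"assigned_arm": "A", "adherent": True, "outcome": 10},
    {"assigned_arm": "A", "adherent": False, "outcome": 6},
    {"assigned_arm": "B", "adherent": True, "outcome": 4},
    {"assigned_arm": "B", "adherent": True, "outcome": 4},
]
itt, per_protocol = itt_and_per_protocol(participants)
```

Comparing the two results side by side is one form of the sensitivity analysis mentioned above: a large gap suggests adherence has a big influence on the apparent benefit.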