1 00:00:04,104 --> 00:00:06,073 BY DEVELOPING 2 00:00:06,073 --> 00:00:06,673 THE FIRST 3 00:00:06,673 --> 00:00:08,175 CELL FREE SYSTEM 4 00:00:08,175 --> 00:00:10,410 TO SUPPORT VERTEBRATE CHROMOSOMAL 5 00:00:10,410 --> 00:00:13,413 DNA REPLICATION USING FROG EGG EXTRACTS. 6 00:00:13,780 --> 00:00:15,015 AND THIS WORK TRANSFORMED 7 00:00:15,015 --> 00:00:17,618 THE STUDY OF DNA REPLICATION. 8 00:00:17,618 --> 00:00:20,687 IN 1999, DOCTOR WALTER JOINED HARVARD 9 00:00:20,687 --> 00:00:22,656 MEDICAL SCHOOL AS AN ASSISTANT PROFESSOR 10 00:00:22,656 --> 00:00:23,490 AND WAS LATER 11 00:00:23,490 --> 00:00:26,526 PROMOTED TO FULL PROFESSOR IN 2010. 12 00:00:26,526 --> 00:00:30,130 AND SINCE 2013, HE HAS ALSO BEEN 13 00:00:30,130 --> 00:00:31,632 AN INVESTIGATOR AT THE HOWARD 14 00:00:31,632 --> 00:00:33,200 HUGHES MEDICAL INSTITUTE. 15 00:00:33,200 --> 00:00:34,768 DOCTOR WALTER REMAINS 16 00:00:34,768 --> 00:00:36,703 A DEDICATED EDUCATOR, 17 00:00:36,703 --> 00:00:38,105 TEACHING DNA REPLICATION 18 00:00:38,105 --> 00:00:39,473 AND REPAIR IN 19 00:00:39,473 --> 00:00:42,209 GRADUATE COURSES AT HARVARD. 20 00:00:42,209 --> 00:00:44,611 WE'RE THRILLED TO HAVE HIM HERE TODAY 21 00:00:44,611 --> 00:00:45,078 TO SHARE 22 00:00:45,078 --> 00:00:47,047 HIS INSIGHTS AND EXPERTISE. 23 00:00:47,047 --> 00:00:47,981 FOR THE AUDIENCE, 24 00:00:47,981 --> 00:00:48,482 PLEASE 25 00:00:48,482 --> 00:00:50,617 PUT ANY QUESTIONS IN THE CHAT BOX 26 00:00:50,617 --> 00:00:51,852 AND WE WILL GO THROUGH THEM 27 00:00:51,852 --> 00:00:53,420 AT THE END OF THE SEMINAR. 28 00:00:53,420 --> 00:00:55,422 AND PLEASE JOIN ME IN WELCOMING 29 00:00:55,422 --> 00:00:56,857 DOCTOR JOHANNES WALTER. 30 00:00:59,593 --> 00:01:00,827 WELL, THANK YOU VERY MUCH, CHUN, 31 00:01:00,827 --> 00:01:02,496 FOR THE VERY NICE INTRODUCTION. 32 00:01:02,496 --> 00:01:05,432 CAN EVERYONE HEAR ME? 33 00:01:05,432 --> 00:01:06,533 YES WE CAN. 34 00:01:06,533 --> 00:01:09,469 OKAY, GREAT. 
35 00:01:09,469 --> 00:01:09,870 SO. YEAH. 36 00:01:09,870 --> 00:01:11,271 SO IT'S A REAL PLEASURE AND HONOR 37 00:01:11,271 --> 00:01:14,741 TO BE ABLE TO GIVE THIS TALK IN 38 00:01:15,142 --> 00:01:17,244 WHAT I GUESS IS THE ORIGINAL 39 00:01:17,244 --> 00:01:20,080 GENOME MAINTENANCE WEBINAR. 40 00:01:20,080 --> 00:01:23,750 AND I'M VERY EXCITED TO TELL YOU TODAY 41 00:01:23,750 --> 00:01:26,086 ABOUT HOW WE HAVE TACKLED 42 00:01:26,086 --> 00:01:27,587 A REALLY CLASSIC PROBLEM, 43 00:01:27,587 --> 00:01:29,957 THE PROBLEM OF HOW TRANSCRIPTION 44 00:01:29,957 --> 00:01:32,526 IS COUPLED TO REPAIR, 45 00:01:32,526 --> 00:01:35,562 USING A FEW NEW APPROACHES. 46 00:01:35,963 --> 00:01:38,565 ONE OF THESE IS A CELL FREE SYSTEM 47 00:01:38,565 --> 00:01:41,568 THAT WE HAVE DEVELOPED THAT SUPPORTS TC-NER, 48 00:01:42,035 --> 00:01:45,038 AND ANOTHER IS USING 49 00:01:45,205 --> 00:01:46,506 AN IN SILICO SCREENING 50 00:01:46,506 --> 00:01:46,907 METHOD 51 00:01:46,907 --> 00:01:48,108 TO LOOK FOR NEW PROTEIN- 52 00:01:48,108 --> 00:01:50,610 PROTEIN INTERACTIONS. 53 00:01:50,610 --> 00:01:52,446 AND BY COMBINING THESE APPROACHES, 54 00:01:52,446 --> 00:01:53,613 WE THINK WE'VE MADE 55 00:01:53,613 --> 00:01:56,450 AN IMPORTANT NEW OBSERVATION, 56 00:01:56,450 --> 00:01:58,185 WHICH IS REALLY 57 00:01:58,185 --> 00:01:59,987 IDENTIFYING THE ROLE OF A PROTEIN CALLED 58 00:01:59,987 --> 00:02:02,022 STK19 IN COUPLING 59 00:02:02,022 --> 00:02:04,691 TRANSCRIPTION STALLING TO THE DOWNSTREAM 60 00:02:04,691 --> 00:02:05,726 REPAIR EVENTS. 
61 00:02:07,194 --> 00:02:07,828 SO I'M GOING TO 62 00:02:07,828 --> 00:02:10,831 BEGIN JUST WITH THE 30,000-FOOT VIEW, 63 00:02:11,231 --> 00:02:13,934 WHICH IS TO SAY THAT COLLECTIVELY, 64 00:02:13,934 --> 00:02:15,335 AS A FIELD, 65 00:02:15,335 --> 00:02:16,837 WE ARE ALL TRYING TO UNDERSTAND 66 00:02:16,837 --> 00:02:20,374 THE HIGHLY VARIED DNA REPAIR PATHWAYS 67 00:02:20,807 --> 00:02:25,278 THAT RESPOND TO AND FIX THE VERY DIVERSE 68 00:02:25,278 --> 00:02:27,547 LESIONS THAT ARE GENERATED BY EXOGENOUS 69 00:02:27,547 --> 00:02:30,217 AND ENDOGENOUS AGENTS. 70 00:02:30,217 --> 00:02:32,219 AND WE'RE ALSO, OF COURSE, 71 00:02:32,219 --> 00:02:33,420 TRYING TO UNDERSTAND 72 00:02:33,420 --> 00:02:35,956 HOW DEFECTS IN THESE PATHWAYS 73 00:02:35,956 --> 00:02:38,291 GIVE RISE TO THE VARIOUS DISEASES 74 00:02:38,291 --> 00:02:39,626 THAT WE SEE IN HUMANS, 75 00:02:39,626 --> 00:02:42,629 NOT ONLY TO UNDERSTAND REALLY 76 00:02:42,629 --> 00:02:43,830 THE BASIS 77 00:02:43,830 --> 00:02:46,500 OF THESE PATHOLOGICAL STATES, BUT ALSO, 78 00:02:46,500 --> 00:02:48,101 IN SOME CASES, TO ACTUALLY SEE 79 00:02:48,101 --> 00:02:49,469 IF WE CAN FIND 80 00:02:49,469 --> 00:02:52,873 SOME THERAPEUTIC INTERVENTIONS 81 00:02:53,206 --> 00:02:54,741 TO AMELIORATE 82 00:02:54,741 --> 00:02:57,277 SOME OF THESE CONDITIONS. 83 00:02:57,277 --> 00:02:59,312 NOW, MY LABORATORY FOR MANY YEARS 84 00:02:59,312 --> 00:03:00,047 HAS FOCUSED 85 00:03:00,047 --> 00:03:01,681 ON REPLICATION 86 00:03:01,681 --> 00:03:03,950 COUPLED REPAIR OF VERY COMPLEX 87 00:03:03,950 --> 00:03:05,752 LESIONS LIKE DNA INTERSTRAND 88 00:03:05,752 --> 00:03:08,288 CROSSLINKS AND DNA-PROTEIN CROSSLINKS. 89 00:03:08,288 --> 00:03:10,424 AND THAT INTEREST CONTINUES. 
90 00:03:10,424 --> 00:03:11,691 BUT WHAT I'M GOING TO FOCUS ON 91 00:03:11,691 --> 00:03:14,361 TODAY IS A RELATIVELY NEW INTEREST 92 00:03:14,361 --> 00:03:15,629 IN THE MECHANISM 93 00:03:15,629 --> 00:03:16,997 OF NUCLEOTIDE 94 00:03:16,997 --> 00:03:19,599 EXCISION REPAIR, IN PARTICULAR, 95 00:03:19,599 --> 00:03:20,500 THE TRANSCRIPTION 96 00:03:20,500 --> 00:03:23,170 COUPLED BRANCH OF THIS PATHWAY. 97 00:03:23,170 --> 00:03:24,504 SO LET ME BEGIN 98 00:03:24,504 --> 00:03:25,639 BY GIVING YOU A LITTLE BIT OF 99 00:03:25,639 --> 00:03:27,274 BACKGROUND ABOUT WHAT WE KNOW 100 00:03:28,408 --> 00:03:30,110 ABOUT NUCLEOTIDE EXCISION REPAIR. 101 00:03:30,110 --> 00:03:32,746 THIS IS A PATHWAY THAT REALLY SPECIALIZES 102 00:03:32,746 --> 00:03:35,982 IN THE REPAIR OF VERY BULKY 103 00:03:35,982 --> 00:03:39,319 DNA LESIONS THAT DISTORT THE DNA DUPLEX. 104 00:03:39,319 --> 00:03:40,854 THESE CAN BE GENERATED 105 00:03:40,854 --> 00:03:43,623 BY UV LIGHT, CIGARETTE SMOKE, 106 00:03:43,623 --> 00:03:47,494 AND ALSO VARIOUS ENDOGENOUS ALDEHYDES, 107 00:03:48,728 --> 00:03:50,597 AND THESE AGENTS 108 00:03:50,597 --> 00:03:51,965 LEAD TO THE FORMATION 109 00:03:51,965 --> 00:03:53,867 OF THESE BULKY ADDUCTS 110 00:03:53,867 --> 00:03:55,502 THAT HAVE THE PROPERTY 111 00:03:55,502 --> 00:03:58,505 OF ACTUALLY DISTORTING THE DNA DUPLEX. 112 00:03:58,505 --> 00:04:01,007 AND IN ONE BRANCH OF NUCLEOTIDE EXCISION 113 00:04:01,007 --> 00:04:02,242 REPAIR, CALLED THE GLOBAL 114 00:04:02,242 --> 00:04:03,176 GENOMIC PATHWAY, 115 00:04:03,176 --> 00:04:03,877 WHICH CAN 116 00:04:03,877 --> 00:04:05,078 IN PRINCIPLE FUNCTION 117 00:04:05,078 --> 00:04:07,581 ANYWHERE IN THE GENOME, 
118 00:04:07,581 --> 00:04:09,883 THE DISTORTION IN THE DNA DUPLEX 119 00:04:09,883 --> 00:04:12,853 IS RECOGNIZED BY A PROTEIN CALLED XPC, 120 00:04:12,853 --> 00:04:14,988 THE XERODERMA PIGMENTOSUM 121 00:04:14,988 --> 00:04:16,456 GROUP 122 00:04:16,456 --> 00:04:17,324 C PROTEIN 123 00:04:17,324 --> 00:04:20,393 THAT IS MUTATED IN THIS 124 00:04:20,393 --> 00:04:23,430 SKIN CANCER PREDISPOSITION SYNDROME. 125 00:04:23,430 --> 00:04:26,466 AND WHAT IT DOES IS REALLY QUITE CLEVER. 126 00:04:26,466 --> 00:04:30,370 IT ACTUALLY SENSES NOT THE LESION ITSELF, 127 00:04:30,370 --> 00:04:32,172 BUT THE DISTORTION IN THE DUPLEX. 128 00:04:32,172 --> 00:04:33,573 AND IT ACTUALLY GRABS 129 00:04:33,573 --> 00:04:36,042 HOLD OF THE NON-DAMAGED STRAND 130 00:04:36,977 --> 00:04:39,479 BY VIRTUE OF THE DESTABILIZATION 131 00:04:39,479 --> 00:04:40,413 OF THE DUPLEX 132 00:04:40,413 --> 00:04:43,416 THAT OCCURS AS A RESULT OF THE DAMAGE. 133 00:04:43,550 --> 00:04:47,187 AND XPC THEN FUNCTIONS TO RECRUIT 134 00:04:47,587 --> 00:04:50,590 THE GENERAL TRANSCRIPTION FACTOR TFIIH, 135 00:04:51,091 --> 00:04:53,894 WHICH 136 00:04:53,894 --> 00:04:56,730 HAS A KEY ROLE 137 00:04:56,730 --> 00:04:59,199 IN INITIATING TRANSCRIPTION. 138 00:04:59,199 --> 00:05:00,934 SO IT ACTUALLY PROMOTES 139 00:05:00,934 --> 00:05:02,068 THE PROMOTER 140 00:05:02,068 --> 00:05:05,372 OPENING AT TRANSCRIPTION START SITES. 141 00:05:06,606 --> 00:05:09,142 AND IT HAS A PARTIALLY OVERLAPPING 142 00:05:09,142 --> 00:05:11,278 FUNCTION IN REPAIR. 143 00:05:11,278 --> 00:05:14,481 SO AFTER BEING RECRUITED BY XPC, 144 00:05:15,215 --> 00:05:18,852 THE XPB SUBUNIT OF TFIIH, 145 00:05:19,219 --> 00:05:21,221 WHICH IS A TRANSLOCASE, 146 00:05:21,221 --> 00:05:21,922 AN ATPASE 147 00:05:21,922 --> 00:05:24,558 THAT TRANSLOCATES ALONG DNA, 148 00:05:24,558 --> 00:05:25,792 ACTUALLY PUMPS 149 00:05:25,792 --> 00:05:30,030 THE DUPLEX TOWARDS THE XPC PROTEIN. 
150 00:05:30,030 --> 00:05:31,631 AND IN DOING SO, 151 00:05:31,631 --> 00:05:33,934 IT ROTATES AND UNDERWINDS 152 00:05:33,934 --> 00:05:35,168 THE DNA, WHICH LEADS 153 00:05:35,168 --> 00:05:37,370 TO LOCAL STRAND SEPARATION. 154 00:05:37,370 --> 00:05:38,738 AND NOW A SECOND 155 00:05:38,738 --> 00:05:41,741 SUBUNIT IN TFIIH, CALLED XPD, 156 00:05:41,908 --> 00:05:43,076 WHICH IS A FIVE PRIME 157 00:05:43,076 --> 00:05:44,578 TO THREE PRIME HELICASE, 158 00:05:45,612 --> 00:05:48,014 GRABS HOLD OF ONE STRAND 159 00:05:48,014 --> 00:05:49,282 AND THEN STARTS 160 00:05:49,282 --> 00:05:52,619 TRANSLOCATING TOWARDS THE SITE OF THE DAMAGE, 161 00:05:53,153 --> 00:05:55,388 AND IF DAMAGE IS PRESENT, 162 00:05:55,388 --> 00:05:58,225 THEN XPD GETS STUCK. 163 00:05:58,225 --> 00:06:01,595 AND THIS IS A PROCESS THAT IS CALLED 164 00:06:01,728 --> 00:06:02,996 LESION VERIFICATION 165 00:06:02,996 --> 00:06:04,698 OR DAMAGE VERIFICATION. 166 00:06:04,698 --> 00:06:07,167 AND THE STALLING OF XPD 167 00:06:07,167 --> 00:06:10,170 THEN NUCLEATES THE DOWNSTREAM EVENTS, 168 00:06:10,170 --> 00:06:11,104 WHICH INCLUDE 169 00:06:11,104 --> 00:06:13,006 THE RECRUITMENT OF STRUCTURE 170 00:06:13,006 --> 00:06:14,975 SPECIFIC ENDONUCLEASES XPF 171 00:06:14,975 --> 00:06:15,976 AND XPG 172 00:06:15,976 --> 00:06:19,212 THAT INCISE THE DAMAGED STRAND AND 173 00:06:19,412 --> 00:06:21,414 REMOVE A SHORT OLIGONUCLEOTIDE 174 00:06:21,414 --> 00:06:23,083 CONTAINING THE LESION. 175 00:06:23,083 --> 00:06:25,585 AND THEN THE FINAL EVENT 176 00:06:25,585 --> 00:06:28,722 IS A GAP FILLING STEP. 177 00:06:30,056 --> 00:06:32,259 SO THIS IS THE SO-CALLED GLOBAL 178 00:06:32,259 --> 00:06:33,159 GENOMIC PATHWAY, 179 00:06:33,159 --> 00:06:34,027 WHICH, AS I SAID, 180 00:06:34,027 --> 00:06:35,095 CAN FUNCTION 181 00:06:35,095 --> 00:06:38,131 ESSENTIALLY ANYWHERE IN THE GENOME. 
182 00:06:38,131 --> 00:06:39,032 BUT THERE'S A SECOND 183 00:06:39,032 --> 00:06:40,634 VERY IMPORTANT BRANCH 184 00:06:40,634 --> 00:06:41,868 OF NUCLEOTIDE EXCISION 185 00:06:41,868 --> 00:06:43,270 REPAIR CALLED TRANSCRIPTION 186 00:06:43,270 --> 00:06:44,070 COUPLED REPAIR. 187 00:06:44,070 --> 00:06:47,641 AND THIS COMES INTO PLAY WHEN THE GLOBAL 188 00:06:47,641 --> 00:06:50,644 GENOMIC PATHWAY FAILS TO REMOVE A LESION 189 00:06:50,844 --> 00:06:51,911 AND AN RNA 190 00:06:51,911 --> 00:06:54,914 POLYMERASE COLLIDES WITH THE DAMAGE. 191 00:06:55,015 --> 00:06:58,151 NOW, THIS WOULD BE A TRULY DIRE SITUATION 192 00:06:58,151 --> 00:06:59,452 BECAUSE 193 00:06:59,452 --> 00:07:01,354 THE LESION STALLS THE POLYMERASE 194 00:07:01,354 --> 00:07:03,256 AND LOCALLY DISRUPTS 195 00:07:03,256 --> 00:07:05,225 GENE EXPRESSION ON THE ONE HAND, 196 00:07:05,225 --> 00:07:08,261 AND ON THE OTHER HAND, THE POLYMERASE 197 00:07:08,261 --> 00:07:09,763 ACTUALLY SHIELDS THE LESION 198 00:07:09,763 --> 00:07:11,564 FROM THE GLOBAL GENOMIC PATHWAY. 199 00:07:11,564 --> 00:07:12,232 SO THIS WOULD BE 200 00:07:13,266 --> 00:07:14,534 REALLY CATASTROPHIC 201 00:07:14,534 --> 00:07:15,135 IF NOT 202 00:07:15,135 --> 00:07:16,369 FOR THE FACT 203 00:07:16,369 --> 00:07:19,572 THAT THE STALLED RNA POLYMERASE ITSELF, 204 00:07:19,839 --> 00:07:21,074 LIKE XPC, 205 00:07:21,074 --> 00:07:22,142 CAN ACTUALLY FUNCTION 206 00:07:22,142 --> 00:07:26,780 AS A SENSOR FOR THE REPAIR PATHWAY. 207 00:07:26,780 --> 00:07:28,782 AND THIS WAS THE INSIGHT OF PHIL HANAWALT 208 00:07:28,782 --> 00:07:31,151 IN THE MID-1980S. 
209 00:07:31,151 --> 00:07:33,320 AND SO, WE NOW KNOW 210 00:07:33,320 --> 00:07:34,788 FROM THE WORK 211 00:07:34,788 --> 00:07:36,589 OF MANY PEOPLE IN THE FIELD, 212 00:07:36,589 --> 00:07:38,992 THE SORT OF OUTLINES OF THIS TC-NER 213 00:07:38,992 --> 00:07:39,993 THIS TRANSCRIPTION 214 00:07:39,993 --> 00:07:41,561 COUPLED NUCLEOTIDE EXCISION 215 00:07:41,561 --> 00:07:43,029 REPAIR PATHWAY. 216 00:07:43,029 --> 00:07:46,032 AND REALLY THE FIRST EVENT HERE 217 00:07:46,099 --> 00:07:47,233 IS THE RECRUITMENT 218 00:07:47,233 --> 00:07:50,136 OF AN ATPASE CALLED CSB. 219 00:07:50,136 --> 00:07:53,673 AND THIS PROTEIN IS MUTATED IN 220 00:07:53,673 --> 00:07:54,908 ANOTHER HUMAN DISORDER 221 00:07:54,908 --> 00:07:56,343 CALLED COCKAYNE SYNDROME, 222 00:07:56,343 --> 00:07:58,211 WHICH IS CHARACTERIZED BY, 223 00:07:58,211 --> 00:08:01,214 NEURODEGENERATION AND PROGERIA. 224 00:08:02,148 --> 00:08:03,683 AND WHAT CSB DOES 225 00:08:03,683 --> 00:08:05,652 IS IT ACTUALLY PUSHES ON THE BACK 226 00:08:05,652 --> 00:08:07,721 END OF RNA POLYMERASE 227 00:08:07,721 --> 00:08:09,856 AND TRIES TO MOVE IT PAST OBSTACLES. 228 00:08:09,856 --> 00:08:11,791 AND IF THAT FAILS, AS IN THE CASE OF 229 00:08:11,791 --> 00:08:15,462 VERY BULKY DNA LESIONS, CSB 230 00:08:15,462 --> 00:08:18,732 THEN NUCLEATES THE FORMATION OF A LARGE 231 00:08:18,865 --> 00:08:21,301 SO-CALLED TC-NER COMPLEX. 232 00:08:21,301 --> 00:08:24,070 AND SO THE NEXT PROTEIN TO LOAD IN 233 00:08:24,070 --> 00:08:25,538 THIS SEQUENCE IS AN E3 234 00:08:25,538 --> 00:08:28,541 UBIQUITIN LIGASE CALLED CRL4CSA, 235 00:08:28,875 --> 00:08:29,943 WHOSE KEY 236 00:08:29,943 --> 00:08:33,146 RECOGNITION SUBUNIT IS CALLED CSA. 237 00:08:33,279 --> 00:08:35,348 AND THAT'S ANOTHER ONE OF THE PROTEINS 238 00:08:35,348 --> 00:08:37,150 MUTATED IN THE COCKAYNE SYNDROME. 
239 00:08:38,318 --> 00:08:39,753 AND CRL4 240 00:08:39,753 --> 00:08:41,287 DOES A FEW THINGS, 241 00:08:41,287 --> 00:08:43,289 ONE OF WHICH IS THAT IT ACTUALLY 242 00:08:43,289 --> 00:08:44,124 UBIQUITYLATES 243 00:08:44,124 --> 00:08:47,127 RNA POLYMERASE ON A SINGLE RESIDUE. 244 00:08:47,494 --> 00:08:48,828 AND THAT UBIQUITINATION 245 00:08:48,828 --> 00:08:51,097 EVENT IS CRUCIAL FOR REPAIR, 246 00:08:51,097 --> 00:08:52,632 ALTHOUGH WE DON'T REALLY UNDERSTAND 247 00:08:52,632 --> 00:08:55,301 THE MECHANISTIC BASIS FOR THAT. 248 00:08:55,301 --> 00:08:59,038 ANOTHER THING THAT THIS E3 LIGASE DOES 249 00:08:59,038 --> 00:09:00,673 IS, TOGETHER WITH CSB, IT 250 00:09:00,673 --> 00:09:03,309 RECRUITS ADDITIONAL FACTORS, 251 00:09:03,309 --> 00:09:05,979 INCLUDING THE UVSSA PROTEIN, 252 00:09:05,979 --> 00:09:07,847 TO COMPLETE THE FORMATION 253 00:09:07,847 --> 00:09:10,850 OF THIS SO-CALLED TC-NER COMPLEX. 254 00:09:11,618 --> 00:09:13,686 AND I'M GOING TO DENOTE IT HERE. 255 00:09:13,686 --> 00:09:14,387 I'M NOT GOING TO 256 00:09:14,387 --> 00:09:18,158 CONTINUALLY DRAW THE ENTIRE E3 LIGASE. 257 00:09:18,958 --> 00:09:19,859 AND WE ACTUALLY HAVE 258 00:09:19,859 --> 00:09:21,027 A PRETTY GOOD UNDERSTANDING 259 00:09:21,027 --> 00:09:22,362 STRUCTURALLY OF WHAT 260 00:09:22,362 --> 00:09:25,498 THIS TC-NER COMPLEX LOOKS LIKE, 261 00:09:25,498 --> 00:09:27,100 BASED ON BEAUTIFUL WORK 262 00:09:27,100 --> 00:09:29,502 FROM PATRICK CRAMER'S LABORATORY. 263 00:09:29,502 --> 00:09:32,672 AND REALLY, THE FUNCTION OF THIS COMPLEX, 264 00:09:32,672 --> 00:09:34,841 ULTIMATELY, IS TO RECRUIT 265 00:09:34,841 --> 00:09:37,410 THE TFIIH PROTEIN. 266 00:09:37,410 --> 00:09:40,180 AND AS WE SAW IN GLOBAL GENOMIC REPAIR, 267 00:09:40,180 --> 00:09:42,048 IN TRANSCRIPTION COUPLED REPAIR, 268 00:09:42,048 --> 00:09:45,051 WHAT IT DOES IS IT LOCALLY OPENS THE DNA. 269 00:09:45,051 --> 00:09:47,954 THAT'S THE FUNCTION AGAIN OF XPB. 
270 00:09:47,954 --> 00:09:49,422 AND THAT ALLOWS XPD 271 00:09:49,422 --> 00:09:52,525 TO ENGAGE WITH THE TRANSCRIBED STRAND, 272 00:09:52,926 --> 00:09:55,628 AND XPD THEN TRANSLOCATES TOWARDS 273 00:09:55,628 --> 00:09:57,163 THE STALLED POLYMERASE 274 00:09:57,163 --> 00:09:58,665 LOOKING FOR DAMAGE. 275 00:09:58,665 --> 00:09:59,399 AND AGAIN, 276 00:09:59,399 --> 00:10:01,334 IF DAMAGE IS SENSED, 277 00:10:01,334 --> 00:10:02,602 THE WHOLE TFIIH 278 00:10:02,602 --> 00:10:04,337 COMPLEX STALLS AND RECRUITS 279 00:10:04,337 --> 00:10:05,205 THE NUCLEASES 280 00:10:05,205 --> 00:10:05,972 THAT THEN 281 00:10:05,972 --> 00:10:09,075 EXCISE THE LESION AND ALLOW GAP FILLING. 282 00:10:09,742 --> 00:10:13,079 A SECOND POSSIBLE FUNCTION OF THIS XPD 283 00:10:13,213 --> 00:10:16,850 DEPENDENT TRANSLOCATION IS THAT 284 00:10:16,850 --> 00:10:18,251 IT PROBABLY, 285 00:10:18,251 --> 00:10:19,185 ALTHOUGH THIS HAS NEVER BEEN 286 00:10:19,185 --> 00:10:20,453 DIRECTLY OBSERVED, 287 00:10:20,453 --> 00:10:22,722 BACKTRACKS RNA POLYMERASE 288 00:10:22,722 --> 00:10:24,591 TO EXPOSE THE LESION 289 00:10:24,591 --> 00:10:26,459 AND EFFECTIVELY ALLOW 290 00:10:26,459 --> 00:10:28,127 THE DOWNSTREAM EVENTS. 291 00:10:29,162 --> 00:10:30,797 SO YOU CAN REALLY THINK OF 292 00:10:30,797 --> 00:10:32,565 BOTH OF THESE PATHWAYS, 293 00:10:32,565 --> 00:10:33,433 BOTH BRANCHES, 294 00:10:33,433 --> 00:10:35,535 AS OCCURRING IN THREE BASIC STEPS: 295 00:10:35,535 --> 00:10:36,870 THE SENSING MECHANISM, 296 00:10:36,870 --> 00:10:39,205 WHICH IN THE CASE OF 297 00:10:39,205 --> 00:10:40,406 GLOBAL GENOMIC REPAIR 298 00:10:40,406 --> 00:10:42,242 REALLY RELIES ON XPC 299 00:10:42,242 --> 00:10:43,243 AND IN THE CASE 300 00:10:43,243 --> 00:10:44,944 OF TRANSCRIPTION COUPLED REPAIR 301 00:10:44,944 --> 00:10:46,980 REALLY RELIES ON RNA POLYMERASE. 
302 00:10:46,980 --> 00:10:48,882 THEN THE VERIFICATION STEP, 303 00:10:48,882 --> 00:10:49,782 WHICH IS COMMON 304 00:10:49,782 --> 00:10:51,851 BETWEEN THE TWO AND DEPENDS ON TFIIH. 305 00:10:51,851 --> 00:10:52,752 AND THEN FINALLY ALSO 306 00:10:52,752 --> 00:10:53,520 THE COMMON 307 00:10:53,520 --> 00:10:56,523 EXCISION AND GAP FILLING EVENTS. 308 00:10:57,190 --> 00:11:00,527 SO, I HOPE THIS GIVES YOU, 309 00:11:00,527 --> 00:11:02,629 A PICTURE 310 00:11:02,629 --> 00:11:04,764 ABOUT WHERE WE ARE IN THE FIELD, 311 00:11:04,764 --> 00:11:06,266 BUT I WOULD LIKE TO POINT OUT 312 00:11:06,266 --> 00:11:07,700 THAT THERE ARE 313 00:11:07,700 --> 00:11:08,968 LOTS OF UNANSWERED 314 00:11:08,968 --> 00:11:09,769 QUESTIONS, 315 00:11:09,769 --> 00:11:12,772 PARTICULARLY ON THE TRANSCRIPTION COUPLED 316 00:11:13,006 --> 00:11:13,873 REPAIR SIDE. 317 00:11:13,873 --> 00:11:15,275 AND THAT'S BECAUSE THIS PATHWAY, 318 00:11:15,275 --> 00:11:17,076 UNLIKE THE GLOBAL GENOMIC PATHWAY, 319 00:11:17,076 --> 00:11:19,479 HAS NOT BEEN FULLY RECONSTITUTED, 320 00:11:19,479 --> 00:11:21,147 WITH PURIFIED COMPONENTS. 321 00:11:21,147 --> 00:11:23,883 AND IN FACT, THERE IS NO CELL FREE SYSTEM 322 00:11:23,883 --> 00:11:26,319 CURRENTLY, AT LEAST UNTIL, 323 00:11:26,319 --> 00:11:29,522 RECENTLY THAT SUPPORTS THIS PATHWAY. 324 00:11:29,522 --> 00:11:30,857 AND SO AS A RESULT, 325 00:11:30,857 --> 00:11:32,191 THERE ARE REALLY SOME 326 00:11:32,191 --> 00:11:33,459 FUNDAMENTAL QUESTIONS 327 00:11:33,459 --> 00:11:34,794 THAT REMAIN UNANSWERED. 328 00:11:34,794 --> 00:11:37,797 ONE OF THOSE IS HOW IS TFIIH 329 00:11:38,197 --> 00:11:40,366 ACTUALLY PROPERLY POSITIONED? 
330 00:11:40,366 --> 00:11:41,434 REMEMBER THAT 331 00:11:41,434 --> 00:11:43,403 IN ORDER FOR PRODUCTIVE REPAIR 332 00:11:43,403 --> 00:11:45,004 TO OCCUR, TFIIH 333 00:11:45,004 --> 00:11:48,575 HAS TO BIND DOWNSTREAM OF THE STALLED 334 00:11:48,575 --> 00:11:49,442 RNA POLYMERASE, 335 00:11:49,442 --> 00:11:52,478 AND IT HAS TO SPECIFICALLY ENGAGE 336 00:11:52,478 --> 00:11:55,381 THROUGH XPD WITH THE TRANSCRIBED STRAND. 337 00:11:55,381 --> 00:11:55,815 THAT'S REALLY 338 00:11:55,815 --> 00:11:57,116 THE ONLY CONFIGURATION 339 00:11:57,116 --> 00:11:59,018 THAT WILL LEAD TO A 340 00:11:59,018 --> 00:12:00,453 PRODUCTIVE REPAIR OUTCOME. 341 00:12:00,453 --> 00:12:01,254 AND HOW 342 00:12:01,254 --> 00:12:02,922 TFIIH IS POSITIONED IN THAT 343 00:12:02,922 --> 00:12:04,824 MANNER IS NOT UNDERSTOOD. 344 00:12:04,824 --> 00:12:06,859 FURTHERMORE, WE REALLY UNDERSTAND 345 00:12:06,859 --> 00:12:08,561 LITTLE ABOUT THE 346 00:12:08,561 --> 00:12:13,232 DYNAMICS OF RNA POLYMERASE. 347 00:12:13,232 --> 00:12:16,502 SO, IN THE COURSE OF REPAIR, 348 00:12:16,736 --> 00:12:17,971 IS IT BACKTRACKED 349 00:12:17,971 --> 00:12:20,473 AND DOES IT RESTART OR IS IT REMOVED? 350 00:12:20,473 --> 00:12:22,709 THESE ARE SOME UNANSWERED QUESTIONS. 351 00:12:22,709 --> 00:12:24,043 AND ALSO, 352 00:12:24,043 --> 00:12:26,012 WHAT IS THE ROLE OF RNA 353 00:12:26,012 --> 00:12:27,780 POLYMERASE UBIQUITINATION? 354 00:12:27,780 --> 00:12:28,982 AND FINALLY, 355 00:12:28,982 --> 00:12:30,683 GIVEN THAT THE PROCESS 356 00:12:30,683 --> 00:12:32,251 HAS NOT BEEN RECONSTITUTED 357 00:12:32,251 --> 00:12:34,053 WITH PURIFIED COMPONENTS, 358 00:12:34,053 --> 00:12:37,123 ARE WE ACTUALLY MISSING ANY KEY FACTORS? 359 00:12:37,123 --> 00:12:38,725 AND MIGHT SUCH FACTORS SHED 360 00:12:38,725 --> 00:12:41,728 LIGHT ON THE ABOVE QUESTIONS? 
361 00:12:41,728 --> 00:12:43,062 SO TODAY I'M GOING TO TELL YOU 362 00:12:43,062 --> 00:12:44,397 ABOUT THE WORK OF A REALLY, 363 00:12:44,397 --> 00:12:45,732 REALLY TALENTED 364 00:12:45,732 --> 00:12:46,666 POSTDOCTORAL FELLOW, 365 00:12:46,666 --> 00:12:47,500 TYCHO MEVISSEN, 366 00:12:47,500 --> 00:12:49,736 WHO ACTUALLY IS ON THE JOB MARKET. 367 00:12:49,736 --> 00:12:52,739 AND WHAT TYCHO DID WAS TO SAY, 368 00:12:53,172 --> 00:12:56,376 LET'S TRY TO ACTUALLY DEVELOP 369 00:12:56,376 --> 00:12:57,577 A CELL-FREE SYSTEM 370 00:12:57,577 --> 00:13:00,480 THAT SUPPORTS EUKARYOTIC TC-NER. 371 00:13:00,480 --> 00:13:02,482 THERE HAVE BEEN SEVERAL ATTEMPTS 372 00:13:02,482 --> 00:13:05,485 OVER THE YEARS TO ACHIEVE THIS, 373 00:13:05,551 --> 00:13:07,220 AND, UNFORTUNATELY, 374 00:13:07,220 --> 00:13:07,887 NONE OF THEM 375 00:13:07,887 --> 00:13:09,589 HAVE BEEN SUCCESSFUL, 376 00:13:09,589 --> 00:13:10,923 AND THAT'S ACTUALLY PRETTY CLEAR 377 00:13:10,923 --> 00:13:11,824 FROM THE FACT 378 00:13:11,824 --> 00:13:13,660 THAT NONE OF THE CELL-FREE SYSTEMS 379 00:13:13,660 --> 00:13:14,761 THAT HAVE BEEN 380 00:13:14,761 --> 00:13:16,562 PROPOSED OR DOCUMENTED 381 00:13:16,562 --> 00:13:19,599 ACTUALLY REALLY DEPEND ON THE CORE 382 00:13:19,599 --> 00:13:20,266 TC-NER 383 00:13:20,266 --> 00:13:24,003 FACTORS SUCH AS CSB AND CRL4CSA. 384 00:13:24,904 --> 00:13:27,907 SO TYCHO REASONED THAT 385 00:13:28,041 --> 00:13:30,043 PERHAPS, GIVEN 386 00:13:30,043 --> 00:13:32,111 THE EXTRAORDINARY POWER 387 00:13:32,111 --> 00:13:34,313 OF XENOPUS EGG EXTRACTS TO RECAPITULATE 388 00:13:35,281 --> 00:13:38,084 SO MANY GENOME MAINTENANCE PATHWAYS, 389 00:13:38,084 --> 00:13:40,019 INCLUDING CHROMATIN ASSEMBLY, 390 00:13:40,019 --> 00:13:41,754 CHECKPOINT SIGNALING, 391 00:13:41,754 --> 00:13:43,456 REPLICATION AND ALL OF THESE REPAIR 392 00:13:43,456 --> 00:13:44,590 PATHWAYS, THEY MIGHT 393 00:13:44,590 --> 00:13:47,226 ALSO BE ABLE TO SUPPORT TC-NER. 
394 00:13:47,226 --> 00:13:49,028 THAT WAS REALLY, 395 00:13:49,028 --> 00:13:52,331 THE SORT OF THE OVERARCHING RATIONALE 396 00:13:52,331 --> 00:13:53,166 OF HIS PROJECT. 397 00:13:53,166 --> 00:13:55,635 THERE WAS ONLY ONE SMALL DETAIL, 398 00:13:55,635 --> 00:13:57,270 AND THAT IS THAT FROG 399 00:13:57,270 --> 00:13:59,005 EGG EXTRACTS, AT LEAST CONVENTIONAL 400 00:13:59,005 --> 00:14:02,775 EGG EXTRACTS, WERE KNOWN TO BE INACTIVE 401 00:14:02,909 --> 00:14:04,110 FOR TRANSCRIPTION. 402 00:14:04,110 --> 00:14:05,712 SO THE FIRST THING THAT TYCHO 403 00:14:05,712 --> 00:14:08,715 HAD TO DO TO TRY TO ACTIVATE TC-NER IN 404 00:14:08,715 --> 00:14:10,483 THIS SYSTEM WAS TO ACTIVATE 405 00:14:10,483 --> 00:14:11,818 TRANSCRIPTION ITSELF. 406 00:14:11,818 --> 00:14:13,252 AND THE WAY HE WENT ABOUT DOING 407 00:14:13,252 --> 00:14:16,489 THIS WAS TO GENERATE A PLASMID 408 00:14:16,489 --> 00:14:19,492 THAT CONTAINS, A VERY STRONG. 409 00:14:21,861 --> 00:14:22,695 SOMEHOW MY, 410 00:14:22,695 --> 00:14:24,731 MY MOUSE IS NOT RESPONDING PROPERLY. 411 00:14:24,731 --> 00:14:27,200 HERE. HOLD ON ONE SECOND. LET'S SEE. 412 00:14:27,200 --> 00:14:29,268 MAYBE THERE'S SOME JUNK ON MY. 413 00:14:29,268 --> 00:14:31,037 I'M, INTERESTING. 414 00:14:31,037 --> 00:14:32,739 HOLD ON ONE SECOND. 415 00:14:38,344 --> 00:14:39,679 OKAY. 416 00:14:39,679 --> 00:14:40,880 THIS IS FASCINATING. 417 00:14:40,880 --> 00:14:43,950 MY MOUSE DECIDED TO STOP WORKING. 418 00:14:45,918 --> 00:14:46,285 OKAY. 419 00:14:46,285 --> 00:14:49,288 WELL, SO, 420 00:14:49,455 --> 00:14:52,291 HE USED A PLASMID, 421 00:14:52,291 --> 00:14:54,861 CONTAINING A VERY STRONG CORE PROMOTER 422 00:14:54,861 --> 00:14:56,929 AS WELL AS A FEW UPSTREAM 423 00:14:56,929 --> 00:14:58,631 ACTIVATING SEQUENCES. 
424 00:14:58,631 --> 00:15:00,333 AND HE ADDED THIS INTO EGG 425 00:15:00,333 --> 00:15:01,467 EXTRACT TOGETHER 426 00:15:01,467 --> 00:15:03,402 WITH A POTENT TRANSCRIPTIONAL ACTIVATOR, 427 00:15:03,402 --> 00:15:05,972 GAL4-VP64. 428 00:15:05,972 --> 00:15:09,475 AND INSTEAD OF USING CONVENTIONAL TOTAL 429 00:15:09,475 --> 00:15:10,777 EGG LYSATES, 430 00:15:10,777 --> 00:15:14,514 WHICH OTHERS HAD TRIED BEFORE AND WHICH REALLY 431 00:15:14,814 --> 00:15:16,749 DIDN'T LEAD TO ANY SIGNIFICANT 432 00:15:16,749 --> 00:15:18,284 TRANSCRIPTIONAL ACTIVATION, 433 00:15:18,284 --> 00:15:19,452 WHAT HE DID 434 00:15:19,452 --> 00:15:22,088 INSTEAD WAS TO USE A CONCENTRATED 435 00:15:22,088 --> 00:15:23,222 NUCLEOPLASMIC 436 00:15:23,222 --> 00:15:25,291 EXTRACT THAT IS HIGHLY ENRICHED 437 00:15:25,291 --> 00:15:27,393 IN NUCLEAR PROTEINS, 438 00:15:27,393 --> 00:15:29,395 INCLUDING RNA POLYMERASE, MEDIATOR, 439 00:15:29,395 --> 00:15:30,329 AND ALL OF THESE THINGS 440 00:15:30,329 --> 00:15:31,197 THAT ARE NEEDED 441 00:15:31,197 --> 00:15:32,532 TO ACTIVATE TRANSCRIPTION. 442 00:15:32,532 --> 00:15:34,300 AND WHEN HE DID THIS 443 00:15:34,300 --> 00:15:37,170 AND ALSO INCLUDED RADIOACTIVE UTP, 444 00:15:37,170 --> 00:15:39,939 HE SAW VERY POTENT TRANSCRIPTION. 445 00:15:39,939 --> 00:15:41,440 SO THIS WAS GREAT 446 00:15:41,440 --> 00:15:44,811 AND MEANT THAT HE HAD OVERCOME 447 00:15:44,811 --> 00:15:46,746 THE FIRST HURDLE TOWARDS 448 00:15:46,746 --> 00:15:48,681 RECAPITULATING TC-NER. 449 00:15:48,681 --> 00:15:49,849 AND I SHOULD NOTE THAT ACTUALLY, 450 00:15:49,849 --> 00:15:51,784 MY FORMER POSTDOC, DAVID LONG, 451 00:15:51,784 --> 00:15:54,020 WHO HAS HIS OWN LABORATORY 452 00:15:54,020 --> 00:15:55,321 IN SOUTH CAROLINA, 453 00:15:55,321 --> 00:15:56,923 ALSO HAS ACTIVATED 454 00:15:56,923 --> 00:15:58,691 TRANSCRIPTION IN FROG EGG EXTRACTS. 
455 00:15:58,691 --> 00:15:59,292 ALTHOUGH, 456 00:15:59,292 --> 00:16:00,593 HIS APPROACH INVOLVES THE 457 00:16:00,593 --> 00:16:02,328 USE OF ENDOGENOUS PROMOTERS. 458 00:16:03,329 --> 00:16:05,164 SO THE GOAL HERE, 459 00:16:05,164 --> 00:16:08,901 OF COURSE, WAS TO RECAPITULATE TC-NER. 460 00:16:08,901 --> 00:16:10,503 SO THE NEXT THING THAT TYCHO DID 461 00:16:10,503 --> 00:16:13,573 WAS TO ENGINEER A CLASSIC NUCLEOTIDE 462 00:16:13,573 --> 00:16:17,410 EXCISION REPAIR SUBSTRATE, A 1,3-GTG 463 00:16:17,510 --> 00:16:18,911 INTRASTRAND 464 00:16:18,911 --> 00:16:21,914 CROSSLINK, INTO THE TEMPLATE STRAND. 465 00:16:22,014 --> 00:16:23,749 AND AS YOU CAN SEE FROM THE GEL 466 00:16:23,749 --> 00:16:25,218 OVER ON THE RIGHT, 467 00:16:25,218 --> 00:16:27,253 THIS LED TO VERY POTENT 468 00:16:27,253 --> 00:16:29,322 TRANSCRIPTIONAL ARREST. 469 00:16:29,322 --> 00:16:30,990 AND SO NOW THE QUESTION BECAME 470 00:16:30,990 --> 00:16:33,059 WHETHER ANY OF THESE LESIONS 471 00:16:33,059 --> 00:16:34,260 ARE REPAIRED 472 00:16:34,260 --> 00:16:37,129 IN A TRANSCRIPTION COUPLED MANNER. 473 00:16:37,129 --> 00:16:39,031 EXCUSE ME. 474 00:16:39,031 --> 00:16:41,234 AND TO ADDRESS THIS, 475 00:16:41,234 --> 00:16:43,102 TYCHO USED A SIMPLE TRICK, 476 00:16:43,102 --> 00:16:45,938 WHICH WAS TO ASK 477 00:16:45,938 --> 00:16:48,641 WHETHER A PMLI RESTRICTION SITE 478 00:16:48,641 --> 00:16:49,775 THAT COINCIDES 479 00:16:49,775 --> 00:16:52,979 WITH THE LESION IS REGENERATED. 480 00:16:53,145 --> 00:16:55,781 SO THE INPUT PLASMID CANNOT BE CUT 481 00:16:55,781 --> 00:16:57,283 WITH THIS ENZYME DUE 482 00:16:57,283 --> 00:16:58,651 TO THE PRESENCE OF THE LESION. 483 00:16:58,651 --> 00:17:00,586 AND IF ANY ERROR-FREE REPAIR 484 00:17:00,586 --> 00:17:01,787 SHOULD OCCUR, 485 00:17:01,787 --> 00:17:04,423 THE PLASMID WILL BECOME CLEAVABLE. 
486 00:17:05,424 --> 00:17:08,427 SO WHEN HE FIRST 487 00:17:08,728 --> 00:17:10,696 USED THIS ASSAY TO ASK 488 00:17:10,696 --> 00:17:12,765 WHETHER ANY REPAIR IS GOING ON, 489 00:17:12,765 --> 00:17:14,700 HE ACTUALLY COULD SEE 490 00:17:14,700 --> 00:17:17,970 THAT ABOUT 10 TO 30%, 491 00:17:17,970 --> 00:17:19,005 DEPENDING ON THE DAY 492 00:17:19,005 --> 00:17:21,941 AND THE EXTRACT OF THE INPUT PLASMID 493 00:17:21,941 --> 00:17:24,043 WAS UNDERGOING REPAIR. 494 00:17:24,043 --> 00:17:27,847 HOWEVER, HE THEN VERY QUICKLY UNDERSTOOD 495 00:17:28,214 --> 00:17:31,050 THAT THIS IS DUE TO GLOBAL GENOMIC REPAIR 496 00:17:31,050 --> 00:17:33,252 BECAUSE IF HE DEPLETED XPC, 497 00:17:33,252 --> 00:17:34,620 THE INITIATOR OF THE GLOBAL 498 00:17:34,620 --> 00:17:35,922 GENOMIC PATHWAY 499 00:17:35,922 --> 00:17:37,723 REPAIR WAS ELIMINATED 500 00:17:37,723 --> 00:17:38,991 EVEN UNDER CONDITIONS 501 00:17:38,991 --> 00:17:41,994 WHERE HE WAS ACTIVATING TRANSCRIPTION. 502 00:17:42,428 --> 00:17:44,797 SO IT TOOK US A LITTLE WHILE, 503 00:17:44,797 --> 00:17:45,631 BUT EVENTUALLY 504 00:17:45,631 --> 00:17:48,901 WE REALIZED THAT THIS IS QUITE POSSIBLY 505 00:17:48,901 --> 00:17:50,436 OR PROBABLY DUE, 506 00:17:50,436 --> 00:17:53,439 AT LEAST IN PART TO THE FACT THAT 507 00:17:53,606 --> 00:17:57,877 THE CONCENTRATION OF CLASSIC TC-NER 508 00:17:57,877 --> 00:17:58,911 PROTEINS, 509 00:17:58,911 --> 00:18:00,446 SUCH AS CSB AND CRL4CSA 510 00:18:00,446 --> 00:18:03,449 IN EGG EXTRACT IS VERY LOW. 511 00:18:04,517 --> 00:18:07,153 SO THEREFORE, TYCHO GENERATED 512 00:18:07,153 --> 00:18:11,791 A COCKTAIL OF KNOWN AND SUSPECTED 513 00:18:11,791 --> 00:18:15,061 TC-NER FACTORS THAT'S SHOWN, DOWN HERE. 
514 00:18:15,561 --> 00:18:19,298 AND VERY EXCITINGLY, WHEN HE SUPPLEMENTED 515 00:18:19,398 --> 00:18:21,400 THE EGG EXTRACTS WITH THIS COCKTAIL, 516 00:18:21,400 --> 00:18:23,135 THIS SO-CALLED TC-NER COCKTAIL, 517 00:18:23,135 --> 00:18:26,072 HE NOW SAW A MASSIVE BOOST 518 00:18:26,072 --> 00:18:28,407 IN THE EFFICIENCY OF REPAIR, 519 00:18:28,407 --> 00:18:29,575 EVEN UNDER CONDITIONS 520 00:18:29,575 --> 00:18:32,011 WHERE GLOBAL GENOMIC REPAIR 521 00:18:32,011 --> 00:18:33,612 WAS STILL INHIBITED 522 00:18:33,612 --> 00:18:36,248 DUE TO THE REMOVAL OF XPC. 523 00:18:36,248 --> 00:18:38,117 AND HE GOT REALLY EXCITED 524 00:18:38,117 --> 00:18:40,853 WHEN HE FOUND THAT IF HE ALSO NOW 525 00:18:40,853 --> 00:18:42,588 INHIBITED TRANSCRIPTION 526 00:18:42,588 --> 00:18:43,990 USING ALPHA-AMANITIN, 527 00:18:43,990 --> 00:18:46,158 REPAIR CAME BACK DOWN AGAIN. 528 00:18:46,158 --> 00:18:47,994 SO THIS REALLY SUGGESTED 529 00:18:47,994 --> 00:18:50,997 THAT FOR THE FIRST TIME HE COULD OBSERVE 530 00:18:51,197 --> 00:18:54,367 BONA FIDE TRANSCRIPTION COUPLED REPAIR. 531 00:18:54,367 --> 00:18:56,102 AND HE WENT ON, IN EXPERIMENTS 532 00:18:56,102 --> 00:18:57,403 THAT I WON'T SHOW YOU, 533 00:18:57,403 --> 00:18:58,704 TO DEMONSTRATE 534 00:18:58,704 --> 00:19:01,640 THAT THIS TRANSCRIPTION DEPENDENT 535 00:19:01,640 --> 00:19:03,342 REPAIR SIGNAL 536 00:19:03,342 --> 00:19:05,911 THAT HE SEES IS ABSOLUTELY DEPENDENT 537 00:19:05,911 --> 00:19:10,282 ON ALL FOUR OF THE CANONICAL TC-NER FACTORS. 538 00:19:10,282 --> 00:19:11,217 AND SO THAT REALLY 539 00:19:11,217 --> 00:19:13,152 CONVINCED US THAT FOR THE FIRST TIME, 540 00:19:13,152 --> 00:19:15,654 WE HAD BEEN ABLE TO RECAPITULATE 541 00:19:15,654 --> 00:19:19,058 BONA FIDE TC-NER IN THE TEST TUBE. 
542 00:19:19,058 --> 00:19:20,593 AND THIS NOW SET THE STAGE 543 00:19:20,593 --> 00:19:21,961 TO REALLY EMBARK 544 00:19:21,961 --> 00:19:25,331 ON A DETAILED MECHANISTIC ANALYSIS OF, 545 00:19:25,598 --> 00:19:26,599 OF THIS PATHWAY 546 00:19:26,599 --> 00:19:27,867 AND TO TRY TO ADDRESS SOME 547 00:19:27,867 --> 00:19:29,201 OF THE UNANSWERED QUESTIONS. 548 00:19:30,703 --> 00:19:32,538 SO THE FIRST, 549 00:19:32,538 --> 00:19:37,643 QUESTION THAT HE TACKLED WAS TO ADDRESS 550 00:19:37,643 --> 00:19:39,478 THE FUNCTION OF AN OBSCURE 551 00:19:39,478 --> 00:19:41,514 PROTEIN CALLED STK19 552 00:19:41,514 --> 00:19:44,917 THAT STANDS FOR SERINE/THREONINE KINASE 19 553 00:19:45,317 --> 00:19:48,320 AND IS A TOTAL MISNOMER BECAUSE, IN FACT, 554 00:19:48,687 --> 00:19:50,322 THE PROTEIN CONSISTS 555 00:19:50,322 --> 00:19:53,059 OF A SERIES OF WINGED HELIX DOMAINS. 556 00:19:53,059 --> 00:19:54,527 IT HAS NO KINASE FOLD. 557 00:19:54,527 --> 00:19:57,296 IT HAS NO DETECTABLE KINASE ACTIVITY. 558 00:19:57,296 --> 00:19:58,064 HOWEVER, WHAT'S 559 00:19:58,064 --> 00:19:59,365 INTERESTING ABOUT THE PROTEIN 560 00:19:59,365 --> 00:20:01,100 IS THAT IT HAD BEEN CONNECTED 561 00:20:01,100 --> 00:20:03,102 TO TC-NER, 562 00:20:03,102 --> 00:20:04,370 BY VARIOUS GROUPS 563 00:20:04,370 --> 00:20:06,639 IN DIFFERENT INDIRECT WAYS, 564 00:20:06,639 --> 00:20:07,907 THE MOST INTERESTING 565 00:20:07,907 --> 00:20:10,142 OF WHICH PROBABLY IS THAT 566 00:20:10,142 --> 00:20:11,677 IT IS REQUIRED 567 00:20:11,677 --> 00:20:12,611 FOR THE RESUMPTION 568 00:20:12,611 --> 00:20:16,449 OF TRANSCRIPTION AFTER, A UV 569 00:20:16,449 --> 00:20:19,452 INSULT OR CELLULAR UV EXPOSURE 570 00:20:20,352 --> 00:20:21,454 AS SEEN 571 00:20:21,454 --> 00:20:24,857 FOR ESSENTIALLY ALL OTHER CORE 572 00:20:24,857 --> 00:20:26,125 TC-NER FACTORS. 
573 00:20:26,125 --> 00:20:26,926 AND IN FACT, 574 00:20:26,926 --> 00:20:28,894 WHEN TYCHO DEPLETED THE PROTEIN 575 00:20:28,894 --> 00:20:30,296 FROM HIS CELL FREE SYSTEM, 576 00:20:30,296 --> 00:20:33,466 HE SAW A COMPLETE ABOLITION OF CELL FREE 577 00:20:33,833 --> 00:20:35,167 REPAIR, 578 00:20:35,167 --> 00:20:37,036 WHICH HE COULD RESCUE BY ADDING BACK 579 00:20:37,036 --> 00:20:39,705 RECOMBINANT STK19. 580 00:20:39,705 --> 00:20:40,706 SO THIS WAS GREAT. 581 00:20:40,706 --> 00:20:42,074 REALLY ARGUED THAT 582 00:20:42,074 --> 00:20:46,011 THAT, STK19 IS A TRUE 583 00:20:47,046 --> 00:20:48,814 TC-NER REPAIR FACTOR. 584 00:20:48,814 --> 00:20:50,416 AND IT RAISED THE QUESTION 585 00:20:50,416 --> 00:20:53,052 HOW STK19 FUNCTIONS. 586 00:20:53,052 --> 00:20:56,355 NOW, GIVEN THE ABSENCE OF ANY, 587 00:20:58,824 --> 00:20:59,492 CLEARLY 588 00:20:59,492 --> 00:21:02,495 IDENTIFIABLE ENZYMATIC ACTIVITY, 589 00:21:02,661 --> 00:21:06,232 THE NULL HYPOTHESIS BECAME THAT STK19 590 00:21:06,899 --> 00:21:08,067 FUNCTIONS IN TC-NER 591 00:21:08,067 --> 00:21:11,303 BY INTERACTING WITH OTHER PROTEINS. 592 00:21:11,937 --> 00:21:16,809 AND SO, AS YOU ALL KNOW, IDENTIFYING 593 00:21:16,809 --> 00:21:18,344 FUNCTIONALLY RELEVANT PROTEIN 594 00:21:18,344 --> 00:21:21,580 PROTEIN INTERACTIONS IS VERY DIFFICULT. 595 00:21:21,747 --> 00:21:24,049 TYPICALLY ONE MIGHT DO AN 596 00:21:24,049 --> 00:21:25,651 IP MASS SPEC EXPERIMENT 597 00:21:25,651 --> 00:21:27,253 AND LOOK FOR 598 00:21:27,253 --> 00:21:28,754 THINGS THAT CO-IP 599 00:21:28,754 --> 00:21:30,689 WITH, WITH ONE'S PROTEIN OF INTEREST. 600 00:21:30,689 --> 00:21:32,358 BUT THEN THERE'S A LONG ROAD, 601 00:21:32,358 --> 00:21:34,360 IN TERMS OF 602 00:21:34,360 --> 00:21:37,663 VALIDATING WHICH OF THE MANY PROTEINS 603 00:21:37,663 --> 00:21:39,498 ONE RECOVERS 604 00:21:39,498 --> 00:21:41,767 AS ACTUALLY BEING FUNCTIONALLY RELEVANT.
605 00:21:41,767 --> 00:21:42,935 AND THAT OFTEN INVOLVES 606 00:21:42,935 --> 00:21:44,370 AN ELABORATE STRUCTURE 607 00:21:44,370 --> 00:21:45,905 FUNCTION ANALYSIS, 608 00:21:45,905 --> 00:21:48,374 DOMAIN MAPPING AND, AND SO FORTH. 609 00:21:48,374 --> 00:21:50,342 AND THIS CAN REALLY BE A VERY, 610 00:21:50,342 --> 00:21:53,345 VERY LONG AND DIFFICULT PROCESS. 611 00:21:53,612 --> 00:21:56,582 AND SO OVER THE LAST COUPLE OF YEARS, 612 00:21:56,582 --> 00:21:58,918 STIMULATED BY THE STRUCTURE 613 00:21:58,918 --> 00:22:00,553 PREDICTION REVOLUTION, 614 00:22:00,553 --> 00:22:02,688 WE ASKED WHETHER WE CAN 615 00:22:02,688 --> 00:22:03,556 MAKE SORT OF A HAIL 616 00:22:03,556 --> 00:22:04,490 MARY PLAY 617 00:22:04,490 --> 00:22:06,825 AROUND THE CONVENTIONAL APPROACHES 618 00:22:06,825 --> 00:22:08,294 AND ACTUALLY HARNESS 619 00:22:08,294 --> 00:22:09,028 THE STRUCTURE 620 00:22:09,028 --> 00:22:12,198 PREDICTION REVOLUTION TO IDENTIFY 621 00:22:12,198 --> 00:22:13,933 FUNCTIONALLY RELEVANT 622 00:22:13,933 --> 00:22:16,569 PROTEIN-PROTEIN INTERACTIONS. AND 623 00:22:16,569 --> 00:22:17,970 THIS IS THE WORK OF A VERY TALENTED 624 00:22:17,970 --> 00:22:20,439 GRADUATE STUDENT, ERNST SCHMID. 625 00:22:20,439 --> 00:22:21,473 AND ESSENTIALLY 626 00:22:21,473 --> 00:22:24,476 WHAT HE DID WAS TO USE THE, 627 00:22:25,144 --> 00:22:26,412 DEEP LEARNING STRUCTURE 628 00:22:26,412 --> 00:22:27,580 PREDICTION ALGORITHM 629 00:22:27,580 --> 00:22:30,583 ALPHAFOLD-MULTIMER, A COUSIN OF ALPHAFOLD 630 00:22:30,883 --> 00:22:35,621 WHOSE REALLY DEDICATED FUNCTION 631 00:22:35,621 --> 00:22:37,890 IS TO SOLVE THE STRUCTURE 632 00:22:37,890 --> 00:22:40,292 OF KNOWN PROTEIN-PROTEIN INTERACTIONS. 633 00:22:40,292 --> 00:22:41,860 AND THAT'S OBVIOUSLY SUPER USEFUL. 634 00:22:41,860 --> 00:22:42,194 AND THAT'S 635 00:22:42,194 --> 00:22:44,863 WHAT ALPHAFOLD-MULTIMER WAS DESIGNED FOR.
636 00:22:44,863 --> 00:22:48,167 BUT WHAT ERNST REALIZED, TOGETHER WITH, 637 00:22:48,167 --> 00:22:50,169 SEVERAL OTHER GROUPS LIKE THE BAKER LAB, 638 00:22:50,169 --> 00:22:51,770 THE ELOFSSON LAB AND ALSO DAN 639 00:22:51,770 --> 00:22:52,871 DUROCHER'S LAB, 640 00:22:52,871 --> 00:22:54,440 WAS THAT 641 00:22:54,440 --> 00:22:56,075 PERHAPS ONE COULD USE 642 00:22:56,075 --> 00:22:58,577 THIS STRUCTURE PREDICTION 643 00:22:58,577 --> 00:23:00,179 APPROACH TO ACTUALLY IDENTIFY 644 00:23:00,179 --> 00:23:01,814 NEW PROTEIN-PROTEIN INTERACTIONS. 645 00:23:01,814 --> 00:23:03,082 AND THE WAY THAT ERNST DOES 646 00:23:03,082 --> 00:23:05,351 THIS IS TO TAKE A BAIT PROTEIN 647 00:23:05,351 --> 00:23:08,254 AND THEN FOLD IT WITH, 648 00:23:08,254 --> 00:23:09,455 A LARGE NUMBER 649 00:23:09,455 --> 00:23:11,557 OF SO-CALLED PREY PROTEINS 650 00:23:11,557 --> 00:23:12,858 AND BASICALLY GENERATE 651 00:23:12,858 --> 00:23:15,127 BINARY STRUCTURE PREDICTIONS 652 00:23:15,127 --> 00:23:16,729 OF EACH PAIR USING ALPHAFOLD-MULTIMER, 653 00:23:18,197 --> 00:23:21,200 AND THEN, USE THE 654 00:23:21,700 --> 00:23:24,470 CONFIDENCE METRICS THAT ALPHAFOLD 655 00:23:24,470 --> 00:23:27,473 PUTS OUT TO PRIORITIZE THESE 656 00:23:27,873 --> 00:23:30,242 ALL OF THESE PREDICTIONS. 657 00:23:30,242 --> 00:23:32,645 AND ESSENTIALLY TO ASK 658 00:23:32,645 --> 00:23:33,445 WHICH OF THESE 659 00:23:33,445 --> 00:23:34,613 IS MOST LIKELY 660 00:23:34,613 --> 00:23:36,015 TO BE A FUNCTIONALLY 661 00:23:36,015 --> 00:23:37,616 RELEVANT INTERACTION? 662 00:23:37,616 --> 00:23:38,484 AND I'M NOT GOING TO GO 663 00:23:38,484 --> 00:23:40,719 INTO THE TECHNICAL DETAILS OF 664 00:23:40,719 --> 00:23:44,189 THE ALPHAFOLD CONFIDENCE METRICS 665 00:23:44,189 --> 00:23:45,124 THAT ARE LISTED HERE, 666 00:23:45,124 --> 00:23:48,127 THINGS LIKE IPTM, PLDDT AND SO FORTH 667 00:23:48,460 --> 00:23:50,562 JUST BECAUSE IT DOESN'T 668 00:23:50,562 --> 00:23:52,431 ADD MUCH TO THE DISCUSSION.
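The bait/prey screen described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `PairPrediction` record, the `rank_preys` helper, the protein names, and every score are hypothetical placeholders, and no structure prediction is actually run. It shows just the final step of sorting one bait's pairwise predictions by a single interface confidence value (ipTM is used here as an example).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairPrediction:
    """One binary (bait, prey) structure prediction with one confidence score."""
    bait: str
    prey: str
    iptm: float  # hypothetical interface confidence, assumed to lie in [0, 1]


def rank_preys(predictions, bait):
    """Return all predictions for `bait`, most confident first."""
    hits = [p for p in predictions if p.bait == bait]
    return sorted(hits, key=lambda p: p.iptm, reverse=True)


# Made-up scores for illustration only; these are not results from the talk.
predictions = [
    PairPrediction("BAIT", "PREY_A", 0.21),
    PairPrediction("BAIT", "PREY_B", 0.83),
    PairPrediction("BAIT", "PREY_C", 0.47),
    PairPrediction("OTHER", "PREY_B", 0.90),
]

ranked = rank_preys(predictions, "BAIT")
for rank, p in enumerate(ranked, start=1):
    print(rank, p.prey, p.iptm)
```

In a real screen the scores would come from the prediction output for each pair, and several metrics might be combined rather than sorting on one.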
669 00:23:52,431 --> 00:23:55,434 BUT I WILL TELL YOU THAT FOR THESE 670 00:23:55,901 --> 00:23:56,268 AT LEAST 671 00:23:56,268 --> 00:23:57,703 FOR THESE LIMITED SCREENS, 672 00:23:57,703 --> 00:24:00,039 THESE HAVE BEEN REALLY QUITE USEFUL 673 00:24:00,039 --> 00:24:00,973 IN IDENTIFYING 674 00:24:00,973 --> 00:24:03,909 FUNCTIONALLY RELEVANT INTERACTIONS. 675 00:24:03,909 --> 00:24:07,846 AND SO SPURRED BY SOME EARLY SUCCESSES 676 00:24:07,846 --> 00:24:09,448 THAT I'LL TELL YOU ABOUT IN A MOMENT, 677 00:24:09,448 --> 00:24:11,583 ULTIMATELY, WE ACTUALLY WENT AHEAD 678 00:24:11,583 --> 00:24:12,818 AND FOLDED 679 00:24:12,818 --> 00:24:15,888 300 CORE GENOME MAINTENANCE PROTEINS 680 00:24:16,021 --> 00:24:16,789 WITH EACH OTHER, 681 00:24:16,789 --> 00:24:19,758 GENERATING A MATRIX OF ABOUT 40,000 PAIRS 682 00:24:19,758 --> 00:24:21,627 THAT WE MADE FREELY AVAILABLE 683 00:24:21,627 --> 00:24:22,528 AT THIS WEBSITE 684 00:24:22,528 --> 00:24:23,996 CALLED PREDICTOMES.ORG, 685 00:24:23,996 --> 00:24:26,465 WHICH HOPEFULLY MANY OF YOU HAVE SEEN 686 00:24:26,465 --> 00:24:30,636 AND AND AND ARE ACTUALLY USING AND 687 00:24:32,204 --> 00:24:32,838 I WILL 688 00:24:32,838 --> 00:24:33,906 TRY TO ILLUSTRATE THE 689 00:24:33,906 --> 00:24:36,108 POWER OF THIS RESOURCE 690 00:24:36,108 --> 00:24:37,309 USING ONE EXAMPLE 691 00:24:37,309 --> 00:24:39,978 BEFORE I COME BACK TO THE STK19 STORY. 692 00:24:39,978 --> 00:24:40,746 SO ONE OF THE 693 00:24:40,746 --> 00:24:42,981 THE EARLIEST SUCCESS STORIES 694 00:24:42,981 --> 00:24:46,385 FOR US WAS USING STRUCTURE PREDICTION 695 00:24:46,385 --> 00:24:47,119 TO IDENTIFY 696 00:24:47,119 --> 00:24:48,787 FUNCTIONALLY RELEVANT PARTNERS 697 00:24:48,787 --> 00:24:49,822 AND THEREFORE 698 00:24:49,822 --> 00:24:51,890 THE FUNCTION OF A PROTEIN CALLED DONSON. 
699 00:24:51,890 --> 00:24:53,192 AND SO THAT PROTEIN 700 00:24:53,192 --> 00:24:54,426 HAD BEEN IMPLICATED 701 00:24:54,426 --> 00:24:55,928 IN SOME ASPECT OF DNA 702 00:24:55,928 --> 00:24:57,796 REPLICATION PREVIOUSLY. 703 00:24:57,796 --> 00:24:59,998 BUT WHAT IT REALLY DOES 704 00:24:59,998 --> 00:25:02,634 WAS QUITE OBSCURE. 705 00:25:02,634 --> 00:25:05,571 AND SO IN THIS STRUCTURE 706 00:25:05,571 --> 00:25:07,039 PREDICTION RESOURCE 707 00:25:07,039 --> 00:25:08,006 THAT WE'VE CREATED, 708 00:25:08,006 --> 00:25:11,210 YOU CAN, CLICK ON YOUR FAVORITE PROTEIN 709 00:25:11,210 --> 00:25:14,713 AND THEN RETRIEVE A PRIORITIZED LIST 710 00:25:14,713 --> 00:25:17,683 BASED ON THE ALPHAFOLD CONFIDENCE METRICS 711 00:25:18,283 --> 00:25:22,654 OF THE LIKELY OR POSSIBLE INTERACTORS. 712 00:25:22,654 --> 00:25:24,656 AND IN THE CASE OF DONSON, 713 00:25:24,656 --> 00:25:27,192 THE RESULTS WERE REALLY QUITE STRIKING. 714 00:25:27,192 --> 00:25:30,195 SO IN THIS LIST OF 300 PROTEINS, 715 00:25:30,195 --> 00:25:33,165 THE TOP 5 OR 6 PROTEINS, 716 00:25:33,165 --> 00:25:36,034 ALL INVOLVED FACTORS 717 00:25:36,034 --> 00:25:38,170 THAT HAD PREVIOUSLY BEEN IMPLICATED 718 00:25:38,170 --> 00:25:39,772 IN THE ASSEMBLY OF THE REPLICATIVE 719 00:25:39,772 --> 00:25:41,440 CMG HELICASE, 720 00:25:41,440 --> 00:25:43,041 WHICH CONSISTS OF THE MCM 721 00:25:43,041 --> 00:25:44,576 TWO THROUGH SEVEN ATPASE, 722 00:25:44,576 --> 00:25:45,911 AND THEN THESE TWO ATPASE 723 00:25:45,911 --> 00:25:49,181 COFACTORS GINS AND CDC45. 724 00:25:49,815 --> 00:25:52,050 SO WHAT ALPHAFOLD SUGGESTED 725 00:25:52,050 --> 00:25:54,052 WAS THAT DONSON INTERACTS WITH TOPBP1, 726 00:25:54,052 --> 00:25:55,854 A FACTOR 727 00:25:55,854 --> 00:25:57,523 PREVIOUSLY KNOWN TO BE REQUIRED 728 00:25:57,523 --> 00:25:58,724 FOR CMG ASSEMBLY, 729 00:25:58,724 --> 00:26:00,292 MCM3, ONE OF THE MCM 730 00:26:00,292 --> 00:26:01,560 TWO THROUGH SEVEN SUBUNITS,
731 00:26:01,560 --> 00:26:01,860 SLD5, 732 00:26:01,860 --> 00:26:03,762 ONE OF THE GINS SUBUNITS, 733 00:26:03,762 --> 00:26:05,931 AND DNA POLYMERASE EPSILON, 734 00:26:05,931 --> 00:26:08,634 THE SECOND SUBUNIT, 735 00:26:08,634 --> 00:26:09,968 WHICH IN YEAST 736 00:26:09,968 --> 00:26:11,370 AT LEAST HAS BEEN IMPLICATED 737 00:26:11,370 --> 00:26:13,138 IN CMG ASSEMBLY. 738 00:26:13,138 --> 00:26:14,306 SO YOU CAN CLICK ON ANY 739 00:26:14,306 --> 00:26:17,042 ONE OF THESE PARTNERS AND RETRIEVE 740 00:26:17,042 --> 00:26:20,045 THE ACTUAL STRUCTURE PREDICTION, WHICH, 741 00:26:20,145 --> 00:26:23,148 HAS ATOMIC RESOLUTION. 742 00:26:23,682 --> 00:26:27,286 AND WHEN WE TOOK ALL OF THESE PREDICTIONS 743 00:26:27,286 --> 00:26:28,854 AND ASSEMBLED THEM 744 00:26:28,854 --> 00:26:31,223 INTO A SINGLE PICTURE, THE 745 00:26:31,223 --> 00:26:33,492 OR A POSSIBLE MECHANISM OF HOW DONSON 746 00:26:33,492 --> 00:26:34,526 CAN PROMOTE 747 00:26:34,526 --> 00:26:36,195 CMG ASSEMBLY 748 00:26:36,195 --> 00:26:37,229 REALLY QUITE LITERALLY 749 00:26:37,229 --> 00:26:38,163 LEAPT OFF THE PAGE. 750 00:26:38,163 --> 00:26:41,467 AND THAT WAS THAT DONSON WOULD ORGANIZE 751 00:26:41,934 --> 00:26:43,836 TOPBP1, GINS 752 00:26:43,836 --> 00:26:45,838 AND POL EPSILON INTO WHAT WE NOW 753 00:26:45,838 --> 00:26:47,940 CALL A PRE-LOADING COMPLEX, 754 00:26:47,940 --> 00:26:49,308 AND THAT IT WOULD USE, 755 00:26:49,308 --> 00:26:50,776 ITS PREDICTED INTERACTION 756 00:26:50,776 --> 00:26:54,246 WITH MCM3 TO REALLY GUIDE THE GINS 757 00:26:54,780 --> 00:26:55,814 HELICASE SUBUNIT 758 00:26:55,814 --> 00:26:59,485 TO ITS PRECISE LOCATION ON THE HELICASE. 759 00:26:59,952 --> 00:27:01,987 AND THROUGH, 760 00:27:01,987 --> 00:27:04,523 A LONG SERIES OF STRUCTURE-FUNCTION
761 00:27:04,523 --> 00:27:06,124 EXPERIMENTS 762 00:27:06,124 --> 00:27:08,927 WHERE WE MUTATED THE PREDICTED INTERFACES 763 00:27:08,927 --> 00:27:11,930 AND ASKED WHETHER THE PRELOADING 764 00:27:11,930 --> 00:27:13,131 COMPLEX STILL FORMS 765 00:27:13,131 --> 00:27:15,133 AND WHETHER REPLICATION STILL OCCURS, 766 00:27:15,133 --> 00:27:16,568 WE WERE ABLE TO PROVIDE 767 00:27:16,568 --> 00:27:18,370 COMPELLING SUPPORT FOR THIS MODEL. 768 00:27:18,370 --> 00:27:20,272 AND I SHOULD ALSO MENTION 769 00:27:20,272 --> 00:27:21,440 THAT THIS WAS ONE OF THESE 770 00:27:21,440 --> 00:27:22,875 PERFECT STORM SITUATIONS 771 00:27:22,875 --> 00:27:25,878 WHERE A NUMBER OF OTHER GROUPS ALSO, 772 00:27:26,378 --> 00:27:28,514 WERE ABLE TO IMPLICATE DONSON 773 00:27:28,514 --> 00:27:30,749 IN THE ASSEMBLY OF CMG. 774 00:27:30,749 --> 00:27:33,752 AND SOME OF THESE GROUPS ACTUALLY ALSO, 775 00:27:34,119 --> 00:27:37,122 PROVIDED EXPERIMENTAL STRUCTURE 776 00:27:37,389 --> 00:27:40,726 DETERMINATION THAT AGREED VERY NICELY 777 00:27:40,726 --> 00:27:42,861 WITH, WITH THE STRUCTURE PREDICTION. 778 00:27:42,861 --> 00:27:45,797 SO THIS WAS A REALLY GRATIFYING CASE 779 00:27:45,797 --> 00:27:49,167 WHERE WE WERE ABLE TO USE THIS IN SILICO 780 00:27:49,167 --> 00:27:50,469 SCREENING APPROACH 781 00:27:50,469 --> 00:27:53,472 TO IDENTIFY FUNCTIONALLY RELEVANT 782 00:27:53,472 --> 00:27:56,375 INTERACTORS OF AN OBSCURE PROTEIN, 783 00:27:56,375 --> 00:27:57,342 SUCH AS DONSON. 784 00:27:58,310 --> 00:28:01,313 SO, NOW I'LL TELL YOU ABOUT 785 00:28:02,147 --> 00:28:04,516 SORT OF, I GUESS THE SIMILAR STORY 786 00:28:04,516 --> 00:28:06,251 WHERE 787 00:28:06,251 --> 00:28:08,353 THIS IN SILICO SCREENING APPROACH 788 00:28:08,353 --> 00:28:09,154 REALLY CRACKED 789 00:28:09,154 --> 00:28:10,088 OPEN THE, 790 00:28:10,088 --> 00:28:13,625 THE QUESTION OF HOW STK19 PROMOTES TC-NER.
791 00:28:13,625 --> 00:28:15,427 SO WHEN WE LOOKED 792 00:28:15,427 --> 00:28:17,729 AT THE LIST, 793 00:28:17,729 --> 00:28:19,531 THE PRIORITIZED LIST 794 00:28:19,531 --> 00:28:23,201 OF PREDICTED STK19 INTERACTORS, 795 00:28:23,201 --> 00:28:25,003 THE REALLY STRIKING FACT 796 00:28:25,003 --> 00:28:27,873 WAS THAT AMONG THE 300 797 00:28:27,873 --> 00:28:29,174 GENOME MAINTENANCE PROTEINS, 798 00:28:29,174 --> 00:28:30,642 THE TOP TWO HITS 799 00:28:30,642 --> 00:28:34,179 WERE THE CSA SUBUNIT OF THE CRL4CSA 800 00:28:34,246 --> 00:28:35,213 LIGASE, 801 00:28:35,213 --> 00:28:38,750 AND THE XPD SUBUNIT OF THE TFIIH COMPLEX. 802 00:28:39,251 --> 00:28:41,119 AND THIS PRETTY MUCH IMMEDIATELY 803 00:28:41,119 --> 00:28:44,022 SUGGESTED A POSSIBLE MECHANISM 804 00:28:44,022 --> 00:28:44,523 BY WHICH 805 00:28:44,523 --> 00:28:46,558 STK19 PROMOTES TC-NER, 806 00:28:46,558 --> 00:28:48,327 AND THAT IS TO INCORPORATE 807 00:28:48,327 --> 00:28:50,562 INTO THE TC-NER COMPLEX 808 00:28:50,562 --> 00:28:51,530 AND THEN USE THE PROTEIN 809 00:28:51,530 --> 00:28:57,035 PROTEIN INTERACTION WITH XPD TO, RECRUIT 810 00:28:57,569 --> 00:29:00,272 OR AT LEAST POSITION TFIIH 811 00:29:00,272 --> 00:29:01,139 IN SUCH A WAY 812 00:29:01,139 --> 00:29:02,441 THAT XPD IS GUIDED 813 00:29:02,441 --> 00:29:04,543 TO THE TEMPLATE STRAND. 814 00:29:04,543 --> 00:29:06,244 SO IN ORDER TO ACTUALLY 815 00:29:06,244 --> 00:29:08,413 TO TEST THE VALIDITY OF THIS MODEL, 816 00:29:08,413 --> 00:29:09,615 WE DID A FEW THINGS. 817 00:29:09,615 --> 00:29:10,515 THE FIRST, 818 00:29:10,515 --> 00:29:13,251 I'LL FOCUS ON THE FIRST PART OF IT, 819 00:29:13,251 --> 00:29:16,455 WHICH IS THE IDEA THAT STK19 FORMS 820 00:29:16,455 --> 00:29:18,290 PART OF THE TC-NER COMPLEX. 821 00:29:18,290 --> 00:29:19,625 AND WE DID TWO THINGS.
822 00:29:19,625 --> 00:29:20,692 THE FIRST 823 00:29:20,692 --> 00:29:21,360 THAT WE COULD 824 00:29:21,360 --> 00:29:23,128 INITIATE IMMEDIATELY, ONCE WE HAD 825 00:29:23,128 --> 00:29:25,163 THE PREDICTION, WAS TO MUTAGENIZE 826 00:29:25,163 --> 00:29:26,632 THE PREDICTED INTERFACE, 827 00:29:26,632 --> 00:29:29,635 GUIDED BY THE ACTUAL MOLECULAR 828 00:29:30,135 --> 00:29:33,672 RESOLUTION STRUCTURE PREDICTION. 829 00:29:34,172 --> 00:29:36,975 AND WHEN WE MADE MUTATIONS IN STK19 830 00:29:36,975 --> 00:29:38,076 THAT ARE PREDICTED 831 00:29:38,076 --> 00:29:40,245 TO DISRUPT THE INTERACTION WITH CSA, 832 00:29:40,245 --> 00:29:41,813 THE RESULTING STK19 833 00:29:41,813 --> 00:29:43,849 MUTANT WAS TOTALLY INACTIVE 834 00:29:43,849 --> 00:29:45,584 FOR CELL FREE TC-NER. 835 00:29:45,584 --> 00:29:48,553 SO THAT WAS CONSISTENT WITH THE IDEA. 836 00:29:48,553 --> 00:29:49,688 BUT WE ACTUALLY, 837 00:29:49,688 --> 00:29:51,523 WERE REALLY, REALLY LUCKY 838 00:29:51,523 --> 00:29:53,058 THAT WE WERE ABLE TO 839 00:29:53,058 --> 00:29:54,026 THEN COLLABORATE 840 00:29:54,026 --> 00:29:55,761 WITH MY COLLEAGUE LUCAS 841 00:29:55,761 --> 00:29:58,063 FARNUNG IN THE CELL BIOLOGY DEPARTMENT. 842 00:29:58,063 --> 00:30:01,333 AND, HE HAD PRETTY MUCH EVERYTHING 843 00:30:01,867 --> 00:30:04,636 LYING AROUND, IN HIS FREEZERS 844 00:30:04,636 --> 00:30:05,604 THAT WAS NEEDED 845 00:30:05,604 --> 00:30:08,640 TO ASSEMBLE THE TC-NER COMPLEX. 846 00:30:08,974 --> 00:30:12,911 AND WE GAVE HIM STK19, AND ALL TOGETHER, 847 00:30:12,911 --> 00:30:14,913 THEN, HE WAS ABLE TO SOLVE 848 00:30:14,913 --> 00:30:18,884 A, A CRYO-EM STRUCTURE OF AN STK19 849 00:30:18,884 --> 00:30:20,185 CONTAINING TC-NER 850 00:30:20,185 --> 00:30:21,953 COMPLEX AT A REALLY IMPRESSIVE 851 00:30:21,953 --> 00:30:25,223 RESOLUTION OF 1.9 ANGSTROM, AND THAT 852 00:30:25,757 --> 00:30:28,560 VERY, BEAUTIFULLY CONFIRMED 853 00:30:28,560 --> 00:30:31,630 THE INTERACTION OF STK19 WITH CSA.
854 00:30:31,630 --> 00:30:34,633 SO, THAT STRUCTURE LOOKS ALMOST EXACTLY, 855 00:30:34,933 --> 00:30:36,568 AS PREDICTED BY ALPHAFOLD. 856 00:30:36,568 --> 00:30:39,871 BUT IT ALSO SHOWED THAT STK19 857 00:30:39,871 --> 00:30:41,640 CAN BIND SIMULTANEOUSLY 858 00:30:41,640 --> 00:30:43,909 WITH CSA AND RPB1, 859 00:30:43,909 --> 00:30:46,745 AND THAT IT ALSO MAKES CLOSE CONTACT 860 00:30:46,745 --> 00:30:50,282 WITH THE DOWNSTREAM DNA, THAT IS, THE DNA 861 00:30:50,282 --> 00:30:53,919 THAT IS IN THE DIRECTION OF, OF MOVEMENT 862 00:30:54,252 --> 00:30:57,255 BEFORE RNA POLYMERASE GETS STALLED. 863 00:30:57,789 --> 00:31:01,026 SO SO THIS REALLY SUPPORTED THE IDEA 864 00:31:01,026 --> 00:31:02,127 THAT STK19 865 00:31:02,127 --> 00:31:03,095 IS AN INTEGRAL 866 00:31:03,095 --> 00:31:06,098 STRUCTURAL COMPONENT OF THE TC-NER COMPLEX. 867 00:31:06,264 --> 00:31:07,032 AND SO NOW WE WANTED 868 00:31:07,032 --> 00:31:08,400 TO TEST THE OTHER IDEA, 869 00:31:08,400 --> 00:31:11,570 WHICH IS THAT STK19 RECRUITS TFIIH. 870 00:31:11,803 --> 00:31:12,304 AND AGAIN 871 00:31:12,304 --> 00:31:15,373 WE STARTED WITH A MUTAGENESIS 872 00:31:15,373 --> 00:31:17,709 GUIDED BY THE STRUCTURE PREDICTION. 873 00:31:17,709 --> 00:31:18,076 AND AGAIN 874 00:31:18,076 --> 00:31:19,778 WE FOUND THAT WHEN WE MUTAGENIZED 875 00:31:19,778 --> 00:31:20,812 THIS INTERFACE 876 00:31:20,812 --> 00:31:23,815 THAT THE RESULTING STK19 PROTEIN 877 00:31:23,815 --> 00:31:27,986 WAS LARGELY INACTIVE FOR CELL FREE TC-NER. 878 00:31:29,087 --> 00:31:31,523 NOW, WE WOULD OF COURSE 879 00:31:31,523 --> 00:31:34,392 LOVE TO ALSO SOLVE THE STRUCTURE 880 00:31:34,392 --> 00:31:36,228 OF THIS ENTIRE ASSEMBLY, 881 00:31:36,228 --> 00:31:37,662 AND WE ARE WORKING ON THAT. 882 00:31:37,662 --> 00:31:38,597 BUT IN THE INTERIM, 883 00:31:38,597 --> 00:31:41,399 WHAT WE DID WAS TO ACTUALLY MODEL 884 00:31:41,399 --> 00:31:44,503 HOW THIS, THIS ASSEMBLY MIGHT LOOK.
885 00:31:44,903 --> 00:31:46,004 AND THE WAY THAT WE DID 886 00:31:46,004 --> 00:31:47,572 THAT WAS TO START 887 00:31:47,572 --> 00:31:50,575 WITH THE CRYO-EM STRUCTURE OF THE STK19 888 00:31:50,575 --> 00:31:53,578 CONTAINING TC-NER COMPLEX. 889 00:31:53,645 --> 00:31:56,481 AND WE THEN TOOK THE ALPHAFOLD PREDICTION 890 00:31:56,481 --> 00:31:59,718 OF THE STK19-XPD INTERACTION. 891 00:31:59,718 --> 00:32:01,419 AND WE ALIGNED IT 892 00:32:01,419 --> 00:32:03,088 TO THE CRYO-EM STRUCTURE 893 00:32:03,088 --> 00:32:05,957 USING THE STK19 SUBUNIT ON THE ONE HAND. 894 00:32:05,957 --> 00:32:07,359 AND THEN ON THE OTHER HAND, 895 00:32:07,359 --> 00:32:10,662 WE TOOK A PREVIOUSLY DETERMINED, AGAIN 896 00:32:10,662 --> 00:32:11,830 BY THE CRAMER LAB, 897 00:32:11,830 --> 00:32:13,331 CRYO-EM STRUCTURE 898 00:32:13,331 --> 00:32:16,935 OF THE ENTIRE TFIIH COMPLEX IN ASSOCIATION 899 00:32:17,502 --> 00:32:20,372 WITH A SPLAYED DNA 900 00:32:20,372 --> 00:32:23,241 SUBSTRATE THAT SORT OF MIMICS 901 00:32:23,241 --> 00:32:27,012 THE TFIIH COMPLEX IN THE ACT OF LESION 902 00:32:27,012 --> 00:32:27,813 VERIFICATION. 903 00:32:27,813 --> 00:32:30,949 HERE, YOU CAN SEE THE SINGLE STRANDED, 904 00:32:30,949 --> 00:32:33,952 DNA GOING THROUGH THE XPD HELICASE, 905 00:32:33,952 --> 00:32:37,455 AND WE ALIGNED THIS COMPLEX ON THE PRIOR 906 00:32:37,455 --> 00:32:40,058 ASSEMBLY USING THE XPD SUBUNIT.
907 00:32:40,058 --> 00:32:42,961 AND IF YOU LOOK AT THIS IN DETAIL, 908 00:32:42,961 --> 00:32:45,964 THE STRIKING FACT IS THAT THE 3’ 909 00:32:45,964 --> 00:32:47,632 END OF THE SINGLE STRAND 910 00:32:47,632 --> 00:32:49,034 EMERGING FROM XPD, 911 00:32:49,034 --> 00:32:51,303 THAT IS THE STRAND THAT IS UNDERGOING 912 00:32:51,303 --> 00:32:52,604 LESION VERIFICATION, 913 00:32:52,604 --> 00:32:53,538 OR AT LEAST MIMICKING 914 00:32:53,538 --> 00:32:54,539 THE PROCESS OF LESION 915 00:32:54,539 --> 00:32:57,909 VERIFICATION, IS VERY CLOSELY JUXTAPOSED 916 00:32:58,243 --> 00:32:59,511 WITH THE 5’ 917 00:32:59,511 --> 00:33:01,446 END OF THE TEMPLATE STRAND 918 00:33:01,446 --> 00:33:03,215 THAT IS PART OF THE 919 00:33:03,215 --> 00:33:05,917 THE STALLED, TC-NER COMPLEX. 920 00:33:05,917 --> 00:33:07,986 SO THIS IS REALLY ENTIRELY 921 00:33:07,986 --> 00:33:09,688 CONSISTENT WITH THE IDEA 922 00:33:10,822 --> 00:33:11,489 THAT STK19 923 00:33:11,489 --> 00:33:15,026 REALLY GUIDES THE TFIIH COMPLEX 924 00:33:15,560 --> 00:33:18,363 TO THE RIGHT POSITION, SO THAT AFTER 925 00:33:18,363 --> 00:33:21,967 XPB UNWINDS THE DUPLEX, XPD 926 00:33:22,200 --> 00:33:24,236 WOULD BE ABLE TO ENGAGE 927 00:33:24,236 --> 00:33:26,304 WITH THE TEMPLATE STRAND. 928 00:33:26,304 --> 00:33:27,706 AND SO IN OTHER WORDS, 929 00:33:27,706 --> 00:33:29,741 WE REALLY THINK NOW THAT STK19 930 00:33:29,741 --> 00:33:30,876 IS THE MISSING LINK 931 00:33:30,876 --> 00:33:32,744 BETWEEN THE TC-NER COMPLEX 932 00:33:32,744 --> 00:33:34,212 AND THE DOWNSTREAM 933 00:33:34,212 --> 00:33:35,714 TFIIH DEPENDENT LESION 934 00:33:35,714 --> 00:33:37,782 VERIFICATION PROCESS.
935 00:33:37,782 --> 00:33:40,952 AND SO IF WE NOW THINK ABOUT THIS 936 00:33:40,952 --> 00:33:43,955 IN TERMS OF, YOU KNOW, THE WHOLE MODEL, 937 00:33:44,022 --> 00:33:45,190 WHAT WE'RE THINKING 938 00:33:45,190 --> 00:33:46,424 IS THAT AFTER RNA POLYMERASE 939 00:33:46,424 --> 00:33:48,226 STALLS AT A LESION, 940 00:33:48,226 --> 00:33:51,463 THERE IS THE ASSEMBLY OF THE TC-NER COMPLEX 941 00:33:51,463 --> 00:33:55,100 THAT WE NOW APPRECIATE INCLUDES STK19. 942 00:33:55,901 --> 00:33:58,737 THE NEXT STEP IS THE RECRUITMENT OF 943 00:33:58,737 --> 00:33:59,204 TFIIH 944 00:33:59,204 --> 00:34:01,640 AND I DIDN'T MENTION THIS BEFORE, 945 00:34:01,640 --> 00:34:02,874 BUT I'LL MENTION IT NOW. 946 00:34:02,874 --> 00:34:05,911 SEVERAL GROUPS HAVE SHOWN THAT 947 00:34:06,177 --> 00:34:08,413 IN ORDER TO RECRUIT 948 00:34:08,413 --> 00:34:11,016 TFIIH TO A TC-NER COMPLEX, 949 00:34:11,016 --> 00:34:12,050 YOU ACTUALLY REQUIRE 950 00:34:12,050 --> 00:34:13,251 AN INTERACTION 951 00:34:13,251 --> 00:34:16,021 BETWEEN THE P62 SUBUNIT OF TFIIH 952 00:34:16,021 --> 00:34:19,224 AND THIS TFIIH INTERACTING REGION IN, 953 00:34:19,591 --> 00:34:21,192 A SHORT PEPTIDE 954 00:34:21,192 --> 00:34:21,893 THAT SITS 955 00:34:21,893 --> 00:34:22,994 ON AN UNSTRUCTURED 956 00:34:22,994 --> 00:34:25,730 LOOP OF THE UVSSA PROTEIN. 957 00:34:25,730 --> 00:34:29,067 SO WE ACTUALLY THINK THAT UVSSA IS 958 00:34:29,067 --> 00:34:31,870 THE SORT OF INITIAL CAPTURE 959 00:34:31,870 --> 00:34:33,872 POINT FOR TFIIH. 960 00:34:33,872 --> 00:34:36,942 BUT WE BELIEVE THAT THE PROTEIN 961 00:34:36,942 --> 00:34:37,876 PROTEIN INTERACTION 962 00:34:37,876 --> 00:34:41,513 BETWEEN STK19 AND XPD IS WHAT ULTIMATELY 963 00:34:41,746 --> 00:34:44,883 PROPERLY POSITIONS TFIIH IN PRECISELY 964 00:34:45,717 --> 00:34:47,452 THE CONFIGURATION THAT IS NEEDED 965 00:34:47,452 --> 00:34:50,455 FOR PRODUCTIVE LESION VERIFICATION. 
966 00:34:50,689 --> 00:34:52,557 AND OF COURSE, THE DOWNSTREAM EVENTS 967 00:34:52,557 --> 00:34:56,094 THEN ARE ASSEMBLY OF AN INCISION 968 00:34:56,094 --> 00:34:59,097 COMPLEX AND GAP FILLING. 969 00:34:59,264 --> 00:35:02,500 NOW, I WOULD LIKE TO, POINT OUT 970 00:35:02,500 --> 00:35:03,902 ONE ASPECT OF THIS 971 00:35:03,902 --> 00:35:05,804 THAT IS DESCRIBED IN DETAIL 972 00:35:05,804 --> 00:35:07,339 IN TYCHO'S PAPER. 973 00:35:07,339 --> 00:35:08,940 I DON'T HAVE TIME TO SHOW YOU THE DATA, 974 00:35:08,940 --> 00:35:10,542 BUT I WANT TO POINT OUT 975 00:35:10,542 --> 00:35:13,578 THAT ACTUALLY THE CRYO-EM STRUCTURE 976 00:35:14,079 --> 00:35:16,915 OF THE STK19 CONTAINING TC-NER 977 00:35:16,915 --> 00:35:20,986 COMPLEX REALLY SHOWS THAT, 978 00:35:21,419 --> 00:35:25,957 THE VHS DOMAIN OF UVSSA 979 00:35:26,591 --> 00:35:30,462 ACTUALLY BLOCKS THE SITE ON STK19, 980 00:35:30,462 --> 00:35:33,665 WHERE XPD WOULD NORMALLY BE RECRUITED. 981 00:35:33,765 --> 00:35:36,134 AND SO THAT WAS A BIT OF A PARADOX. 982 00:35:36,134 --> 00:35:38,703 BUT THEN WE REALIZED THAT ACTUALLY 983 00:35:38,703 --> 00:35:41,673 UVSSA INTERACTS WITH THIS FLEXIBLE, 984 00:35:41,673 --> 00:35:45,677 C-TERMINAL TAIL OF CSA 985 00:35:46,277 --> 00:35:49,614 THAT WE THINK ALLOWS UVSSA 986 00:35:49,614 --> 00:35:51,416 TO SWING OUT OF THE WAY 987 00:35:51,416 --> 00:35:53,952 TO MAKE ROOM FOR XPD. 988 00:35:53,952 --> 00:35:56,688 AND IN FACT, WE MADE A SINGLE POINT 989 00:35:56,688 --> 00:35:59,691 MUTATION IN THIS C-TERMINAL TAIL OF CSA, 990 00:36:00,091 --> 00:36:02,827 THAT WE SHOWED TOTALLY DISRUPTS THE, 991 00:36:02,827 --> 00:36:05,830 THE ABILITY OF CSA TO SUPPORT, 992 00:36:05,897 --> 00:36:08,800 TC-NER, IN TOTAL AGREEMENT WITH, 993 00:36:08,800 --> 00:36:09,768 WITH THIS MODEL. 994 00:36:10,969 --> 00:36:12,003 SO THAT WAS ONE, 995 00:36:12,003 --> 00:36:13,638 SORT OF ADDITIONAL WRINKLE 996 00:36:13,638 --> 00:36:15,040 THAT I WANTED TO POINT OUT.
997 00:36:15,040 --> 00:36:18,043 NOW, THERE ARE OTHER, SORT OF 998 00:36:18,710 --> 00:36:19,878 OBVIOUS QUESTIONS 999 00:36:19,878 --> 00:36:21,546 THAT THAT RESULT FROM THIS MODEL, 1000 00:36:21,546 --> 00:36:25,550 ONE OF WHICH IS HOW THE SO-CALLED CAK 1001 00:36:25,550 --> 00:36:29,487 DOMAIN OF THE TFIIH COMPLEX IS REMOVED. 1002 00:36:29,487 --> 00:36:31,089 SO IT'S KNOWN THAT THE CAK 1003 00:36:31,089 --> 00:36:32,957 DOMAIN IS REQUIRED FOR TFIIH, 1004 00:36:32,957 --> 00:36:35,293 ITS FUNCTION IN TRANSCRIPTION INITIATION, 1005 00:36:35,293 --> 00:36:37,328 BUT IT'S ACTUALLY INHIBITORY 1006 00:36:37,328 --> 00:36:40,331 FOR TC-NER AND HAS TO BE REMOVED. 1007 00:36:40,732 --> 00:36:43,301 AND, THE QUESTION IS, 1008 00:36:43,301 --> 00:36:46,304 AT WHAT STAGE OF THIS RECRUITMENT 1009 00:36:46,304 --> 00:36:48,706 OR THIS CAPTURE AND DOCKING, 1010 00:36:48,706 --> 00:36:50,408 PATHWAY FOR TFIIH 1011 00:36:50,408 --> 00:36:52,477 THE CAK MODULE IS REMOVED 1012 00:36:52,477 --> 00:36:53,144 AND THAT'S SOMETHING 1013 00:36:53,144 --> 00:36:56,147 THAT WE ARE ACTIVELY THINKING ABOUT. 1014 00:36:56,414 --> 00:36:57,482 AND THEN OF COURSE, 1015 00:36:57,482 --> 00:36:59,250 YOU KNOW, BACK TO 1016 00:36:59,250 --> 00:37:00,752 SOME OF THE CLASSIC QUESTIONS 1017 00:37:00,752 --> 00:37:02,954 LIKE WHAT DOES RNA POLYMERASE 1018 00:37:02,954 --> 00:37:04,089 ACTUALLY DO? 1019 00:37:04,089 --> 00:37:05,223 WHAT IS ITS FATE 1020 00:37:05,223 --> 00:37:08,326 DURING DURING THESE, VARIOUS, 1021 00:37:08,793 --> 00:37:10,361 REPAIR STEPS? 1022 00:37:10,361 --> 00:37:12,797 DOES IT DOES IT BACKTRACK AND RESTART 1023 00:37:12,797 --> 00:37:14,399 OR IS IT REMOVED? 1024 00:37:14,399 --> 00:37:15,800 THESE ARE ALSO VERY, 1025 00:37:15,800 --> 00:37:17,802 I THINK INTERESTING QUESTIONS THAT 1026 00:37:17,802 --> 00:37:19,737 THAT PERHAPS NOW BECOME TRACTABLE. 1027 00:37:21,439 --> 00:37:22,173 OKAY. 
1028 00:37:22,173 --> 00:37:23,208 SO THAT'S 1029 00:37:23,208 --> 00:37:23,741 WHAT I WANTED 1030 00:37:23,741 --> 00:37:26,711 TO TELL YOU ABOUT TC-NER 1031 00:37:27,045 --> 00:37:29,814 AND HOW STRUCTURE-PREDICTION 1032 00:37:29,814 --> 00:37:32,584 REALLY ACCELERATED OUR PROGRESS IN 1033 00:37:32,584 --> 00:37:35,653 IN UNDERSTANDING HOW THIS NEW FACTOR, 1034 00:37:35,653 --> 00:37:38,623 STK19 PROMOTES THIS REACTION. 1035 00:37:38,957 --> 00:37:40,758 BUT WHAT I WANT TO DO IN THE LAST 1036 00:37:40,758 --> 00:37:41,960 COUPLE OF MINUTES 1037 00:37:41,960 --> 00:37:45,130 HERE IS TO ASK THE QUESTION, 1038 00:37:45,130 --> 00:37:45,763 WHAT ARE REALLY 1039 00:37:45,763 --> 00:37:48,399 THE LIMITS OF THIS IN SILICO 1040 00:37:48,399 --> 00:37:50,568 SCREENING APPROACH? 1041 00:37:50,568 --> 00:37:52,370 SO 1042 00:37:52,370 --> 00:37:54,539 IN OTHER WORDS, 1043 00:37:54,539 --> 00:37:56,274 CONSIDER AN EXTREME SCENARIO 1044 00:37:56,274 --> 00:37:58,176 WHERE YOU HAVE A PROTEIN 1045 00:37:58,176 --> 00:38:00,812 THAT, YOU FIND INTERESTING 1046 00:38:00,812 --> 00:38:02,113 FOR WHATEVER REASON. 1047 00:38:02,113 --> 00:38:03,448 BUT LET'S SAY, YOU KNOW 1048 00:38:03,448 --> 00:38:05,650 ESSENTIALLY NOTHING ABOUT IT. 1049 00:38:05,650 --> 00:38:08,319 COULD YOU ACTUALLY CARRY OUT 1050 00:38:08,319 --> 00:38:11,322 A PROTEOME WIDE IN SILICO SCREEN 1051 00:38:11,656 --> 00:38:13,658 IN ORDER TO IDENTIFY 1052 00:38:13,658 --> 00:38:14,792 FUNCTIONALLY RELEVANT 1053 00:38:14,792 --> 00:38:15,827 INTERACTORS, WHICH, 1054 00:38:15,827 --> 00:38:16,728 OF COURSE, 1055 00:38:16,728 --> 00:38:19,664 WOULD PROBABLY TELL YOU A LOT ABOUT 1056 00:38:19,664 --> 00:38:20,865 NOT ONLY THE PATHWAY 1057 00:38:20,865 --> 00:38:22,300 IN WHICH THE PROTEIN OPERATES, 1058 00:38:22,300 --> 00:38:25,303 BUT ALSO ABOUT ITS MECHANISM. 1059 00:38:25,837 --> 00:38:29,841 SO, SO WE WERE VERY CURIOUS ABOUT THIS. 
1060 00:38:30,341 --> 00:38:34,045 SO WE WENT AHEAD AND ACTUALLY SCALED UP 1061 00:38:34,145 --> 00:38:35,280 SOME OF THE SCREENS 1062 00:38:35,280 --> 00:38:36,214 THAT I'VE TOLD YOU ABOUT. 1063 00:38:36,214 --> 00:38:39,517 SO, WE WENT AHEAD AND FOLDED STK19 1064 00:38:39,517 --> 00:38:42,720 WITH ALL 20,000 HUMAN PROTEINS. 1065 00:38:42,720 --> 00:38:45,723 AND THEN WE USED VARIOUS METRICS TO ASK, 1066 00:38:46,057 --> 00:38:47,525 WHERE 1067 00:38:47,525 --> 00:38:49,827 IN THIS LIST OF 20,000 1068 00:38:49,827 --> 00:38:51,663 BINARY STRUCTURE PREDICTIONS, 1069 00:38:51,663 --> 00:38:54,532 WOULD THE FUNCTIONALLY RELEVANT PARTNERS 1070 00:38:54,532 --> 00:38:56,301 THAT WE NOW, 1071 00:38:56,301 --> 00:38:57,569 KNOW ABOUT, 1072 00:38:57,569 --> 00:39:00,471 WHERE WOULD THEY SIT IN THIS RANK LIST? 1073 00:39:00,471 --> 00:39:02,173 AND SO IF YOU TAKE ONE OF THE, 1074 00:39:02,173 --> 00:39:05,176 VERY POPULAR CONFIDENCE METRICS 1075 00:39:05,176 --> 00:39:06,277 CALLED PDOCKQ, 1076 00:39:06,277 --> 00:39:08,413 THE RESULTS ARE REALLY QUITE TERRIBLE. 1077 00:39:08,413 --> 00:39:09,147 SO THE, 1078 00:39:09,147 --> 00:39:12,650 THE FUNCTIONALLY RELEVANT PARTNERS, CSA, 1079 00:39:12,650 --> 00:39:14,986 WHICH ALSO GOES BY THE NAME OF ERCC8 1080 00:39:14,986 --> 00:39:15,820 AND XPD, 1081 00:39:15,820 --> 00:39:18,489 WHICH GOES BY THE NAME OF ERCC2, 1082 00:39:18,489 --> 00:39:20,992 AS WELL AS RNA POLYMERASE 1083 00:39:20,992 --> 00:39:23,561 ARE, YOU KNOW, NOWHERE NEAR THE TOP. 1084 00:39:23,561 --> 00:39:25,296 OKAY. SO THAT LOOKS PRETTY BAD. 1085 00:39:25,296 --> 00:39:27,699 IPTM DOES A LOT BETTER. 1086 00:39:27,699 --> 00:39:29,567 SO NOW THESE THREE FUNCTIONALLY 1087 00:39:29,567 --> 00:39:30,034 RELEVANT 1088 00:39:30,034 --> 00:39:34,706 PARTNERS ARE IN THE TOP 60 OR SO, PAIRS. 1089 00:39:34,706 --> 00:39:36,741 SO THAT'S REALLY QUITE GOOD. 
1090 00:39:36,741 --> 00:39:40,578 BUT, YOU KNOW, STILL MAYBE 1091 00:39:40,578 --> 00:39:41,579 NOT GOOD ENOUGH 1092 00:39:41,579 --> 00:39:42,847 IF YOU DIDN'T KNOW ANYTHING 1093 00:39:42,847 --> 00:39:44,449 ABOUT THE PROTEIN. 1094 00:39:44,449 --> 00:39:46,050 WE DID ANOTHER EXAMPLE. 1095 00:39:46,050 --> 00:39:49,020 SO REMEMBER THE DONSON PROTEIN 1096 00:39:49,020 --> 00:39:49,787 THAT IS REQUIRED 1097 00:39:49,787 --> 00:39:51,389 FOR CMG HELICASE 1098 00:39:51,389 --> 00:39:52,790 ASSEMBLY AND THAT INTERACTS 1099 00:39:52,790 --> 00:39:53,791 FUNCTIONALLY 1100 00:39:53,791 --> 00:39:56,794 WITH THESE FOUR OR FIVE FACTORS HERE. 1101 00:39:57,061 --> 00:39:58,830 IF YOU USE PDOCKQ, 1102 00:39:58,830 --> 00:40:00,698 THE RESULTS ARE REALLY PRETTY TERRIBLE. 1103 00:40:00,698 --> 00:40:02,267 IF YOU USE IPTM, IT GETS BETTER. 1104 00:40:02,267 --> 00:40:03,601 BUT, YOU KNOW, 1105 00:40:03,601 --> 00:40:04,636 THE FIVE PARTNERS 1106 00:40:04,636 --> 00:40:07,639 HERE ARE SPREAD OVER THE TOP 1000 HITS. 1107 00:40:07,739 --> 00:40:09,841 AND SO AGAIN, 1108 00:40:09,841 --> 00:40:11,576 THAT'S NOT TERRIBLY USEFUL 1109 00:40:11,576 --> 00:40:14,112 IF YOU DON'T KNOW MUCH ABOUT THE PROTEIN 1110 00:40:14,112 --> 00:40:15,413 TO BEGIN WITH. 1111 00:40:15,413 --> 00:40:16,781 SO FOR US, 1112 00:40:16,781 --> 00:40:19,784 THIS REALLY SUGGESTED THAT WE NEED BETTER 1113 00:40:20,118 --> 00:40:21,919 CONFIDENCE METRICS 1114 00:40:21,919 --> 00:40:22,954 TO IDENTIFY 1115 00:40:22,954 --> 00:40:24,722 FUNCTIONALLY RELEVANT INTERACTORS.
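The ranking test described here (sort every bait-prey prediction by a confidence metric, then ask where the known partners land) can be sketched as follows. This is a minimal illustration, not the lab's pipeline: the function name `partner_ranks`, the decoy names, and all scores are made up for the example.

```python
def partner_ranks(scores, known_partners):
    """Return the 1-based rank of each known partner when all candidate
    preys are sorted by score, highest first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    position = {name: i + 1 for i, name in enumerate(ranked)}
    return {p: position[p] for p in known_partners if p in position}


# Toy, invented scores for illustration only (not data from the talk).
toy_scores = {
    "ERCC8": 0.81,
    "ERCC2": 0.74,
    "DECOY_A": 0.90,
    "DECOY_B": 0.35,
    "DECOY_C": 0.62,
}
ranks = partner_ranks(toy_scores, ["ERCC8", "ERCC2"])
print(ranks)
```

A metric is useful for a blind screen only if the known partners get small rank numbers out of the roughly 20,000 candidates.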
1116 00:40:24,722 --> 00:40:29,060 SO WHAT WE DID WAS TO ACTUALLY USE 1117 00:40:29,294 --> 00:40:29,927 SORT OF 1118 00:40:29,927 --> 00:40:31,429 CLASSICAL MACHINE 1119 00:40:31,429 --> 00:40:34,432 LEARNING APPROACHES TO TRAIN A CLASSIFIER 1120 00:40:34,699 --> 00:40:37,368 THAT WE HOPED WOULD BE ABLE TO TELL 1121 00:40:37,368 --> 00:40:39,337 THE DIFFERENCE BETWEEN 1122 00:40:40,571 --> 00:40:41,639 FUNCTIONAL 1123 00:40:41,639 --> 00:40:42,974 ALPHAFOLD PREDICTIONS 1124 00:40:42,974 --> 00:40:45,510 AND SPURIOUS ALPHAFOLD PREDICTIONS. 1125 00:40:45,510 --> 00:40:46,911 AND SO WE USED, 1126 00:40:46,911 --> 00:40:47,445 YOU KNOW, THE 1127 00:40:47,445 --> 00:40:49,347 CLASSIC APPROACH, 1128 00:40:49,347 --> 00:40:52,350 WHICH IS TO ACTUALLY TRAIN A CLASSIFIER 1129 00:40:52,517 --> 00:40:56,120 ON A SET OF KNOWN DATA POINTS. 1130 00:40:56,120 --> 00:41:00,024 SO WE TOOK A LARGE NUMBER OF 1131 00:41:00,024 --> 00:41:02,226 BINARY ALPHAFOLD PREDICTIONS, 1132 00:41:02,226 --> 00:41:02,660 THE VAST 1133 00:41:02,660 --> 00:41:05,763 MAJORITY OF WHICH WE KNEW TO BE NEGATIVE 1134 00:41:05,763 --> 00:41:07,031 BECAUSE THEY CORRESPONDED 1135 00:41:07,031 --> 00:41:10,401 TO COMPLETELY RANDOM PAIRINGS, 1136 00:41:10,735 --> 00:41:14,772 AS WELL AS A SET OF 1137 00:41:14,772 --> 00:41:16,507 ALPHAFOLD BINARY 1138 00:41:16,507 --> 00:41:17,508 STRUCTURE PREDICTIONS 1139 00:41:17,508 --> 00:41:19,844 THAT WE PRESUMED TO BE POSITIVE 1140 00:41:19,844 --> 00:41:21,979 BECAUSE THEY CORRESPOND TO PROTEIN PAIRS 1141 00:41:21,979 --> 00:41:22,914 THAT ACTUALLY CROSSLINK 1142 00:41:22,914 --> 00:41:24,282 IN VIVO 1143 00:41:24,282 --> 00:41:27,185 IN CROSSLINKING MASS SPEC EXPERIMENTS. 1144 00:41:27,185 --> 00:41:29,153 SO WE MIXED THESE TOGETHER, 1145 00:41:29,153 --> 00:41:33,658 AND THEN WE TRIED TO TRAIN A CLASSIFIER 1146 00:41:34,092 --> 00:41:36,694 THAT IS INFORMED BY 1147 00:41:36,694 --> 00:41:38,463 A NUMBER OF DIFFERENT FEATURES.
1148 00:41:38,463 --> 00:41:39,630 AND SO THESE FEATURES 1149 00:41:39,630 --> 00:41:42,166 THAT WE FED THE CLASSIFIER FALL 1150 00:41:42,166 --> 00:41:43,534 INTO TWO DIFFERENT GROUPS. 1151 00:41:43,534 --> 00:41:45,903 ONE WE CALL STRUCTURAL FEATURES. 1152 00:41:47,071 --> 00:41:47,538 SO THESE 1153 00:41:47,538 --> 00:41:50,541 ARE NOT ONLY THE 1154 00:41:50,541 --> 00:41:53,978 ALPHAFOLD-DERIVED CONFIDENCE METRICS 1155 00:41:53,978 --> 00:41:55,613 THAT COME WITH EVERY BINARY 1156 00:41:55,613 --> 00:41:59,250 STRUCTURE PREDICTION, IPTM, PLDDT 1157 00:41:59,484 --> 00:42:00,318 AND SO FORTH, 1158 00:42:00,318 --> 00:42:04,055 BUT ALSO MEASURABLE PROPERTIES OF THE 1159 00:42:04,222 --> 00:42:05,590 PREDICTIONS THEMSELVES. 1160 00:42:05,590 --> 00:42:09,427 SO HOW MANY HYDROGEN BONDS AND SALT BRIDGES 1161 00:42:09,794 --> 00:42:11,629 ARE AT THE INTERFACE? 1162 00:42:11,629 --> 00:42:13,898 WHAT IS THE SIZE 1163 00:42:13,898 --> 00:42:16,901 OF THE INTERFACE AND SO FORTH? 1164 00:42:17,068 --> 00:42:18,403 AND WE ACTUALLY INITIALLY 1165 00:42:18,403 --> 00:42:19,737 TRAINED A CLASSIFIER 1166 00:42:19,737 --> 00:42:21,572 USING ONLY THESE STRUCTURAL FEATURES. 1167 00:42:21,572 --> 00:42:23,007 AND ITS PERFORMANCE 1168 00:42:23,007 --> 00:42:24,375 WAS DECENT, 1169 00:42:24,375 --> 00:42:27,378 BUT NOT QUITE WHERE WE WANTED IT TO BE. 1170 00:42:27,612 --> 00:42:28,946 SO WE THEN REASONED THAT 1171 00:42:28,946 --> 00:42:31,015 PERHAPS WE ALSO NEED TO BE LOOKING 1172 00:42:31,015 --> 00:42:32,784 AT THE BIOLOGICAL PROPERTIES 1173 00:42:32,784 --> 00:42:35,353 OF ALL OF THE PROTEIN 1174 00:42:35,353 --> 00:42:37,321 PAIRS THAT WE'RE LOOKING AT. 1175 00:42:37,321 --> 00:42:41,225 AND SO WE THEN ALSO CONSIDERED 1176 00:42:41,826 --> 00:42:45,396 OMICS-DERIVED BIOLOGICAL FEATURES 1177 00:42:45,396 --> 00:42:48,433 SUCH AS CO-EXPRESSION DATA, CO-LOCALIZATION.
1178 00:42:48,433 --> 00:42:50,902 ARE THE TWO PROTEINS IN THE PAIR, 1179 00:42:50,902 --> 00:42:52,970 DO THEY SHOW A SIMILAR DEPENDENCY 1180 00:42:52,970 --> 00:42:53,738 IN THE THOUSAND 1181 00:42:53,738 --> 00:42:54,505 OR SO 1182 00:42:54,505 --> 00:42:56,007 GENOME-WIDE CRISPR SCREENS 1183 00:42:56,007 --> 00:42:57,942 THAT HAVE BEEN REPORTED? 1184 00:42:57,942 --> 00:42:59,544 IS THERE EVIDENCE 1185 00:42:59,544 --> 00:43:00,511 FOR AN 1186 00:43:00,511 --> 00:43:03,314 ASSOCIATION IN BIOGRID, ETC.? 1187 00:43:03,314 --> 00:43:06,584 AND WHEN WE USED BOTH THESE 1188 00:43:06,584 --> 00:43:09,120 STRUCTURAL AND BIOLOGICAL FEATURES 1189 00:43:09,120 --> 00:43:12,423 TO INFORM AND TRAIN 1190 00:43:12,490 --> 00:43:15,493 A CLASSIFIER, A RANDOM FOREST CLASSIFIER, 1191 00:43:16,060 --> 00:43:17,929 TO BEST SEPARATE 1192 00:43:17,929 --> 00:43:20,932 THE POSITIVES AND THE NEGATIVES, 1193 00:43:21,132 --> 00:43:23,868 WE ACTUALLY ENDED UP WITH A CLASSIFIER 1194 00:43:23,868 --> 00:43:25,269 THAT WAS REALLY QUITE USEFUL. 1195 00:43:25,269 --> 00:43:27,238 SO, IN THESE MACHINE 1196 00:43:27,238 --> 00:43:28,539 LEARNING APPROACHES, 1197 00:43:28,539 --> 00:43:32,043 ONE TYPICALLY TAKES 75% OF THE 1198 00:43:32,343 --> 00:43:35,613 DATA, THE KNOWN 1199 00:43:35,980 --> 00:43:37,815 DATA POINTS, TO TRAIN. 1200 00:43:37,815 --> 00:43:40,418 AND THEN ONE USES THE REMAINING 25% 1201 00:43:40,418 --> 00:43:42,253 TO ACTUALLY TEST PERFORMANCE.
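The 75%/25% hold-out described above can be sketched with the standard library alone. The talk states only the split proportions; stratifying by class (so that the rare positives appear in both parts), the function name `stratified_split`, and the toy label counts are assumptions made for this illustration.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split row indices into train and test parts, keeping the
    positive/negative ratio roughly equal in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (invented): 40 presumed positives, 360 random-pair negatives.
labels = [1] * 40 + [0] * 360
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

A random forest implementation from a library (for example scikit-learn's `RandomForestClassifier`) could then be fit on the training rows and scored only on the held-out rows.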
1202 00:43:43,321 --> 00:43:44,889 AND, I WILL 1203 00:43:44,889 --> 00:43:48,025 JUST VERY BRIEFLY TELL YOU 1204 00:43:48,092 --> 00:43:51,762 THAT, GUIDED IN LARGE PART 1205 00:43:51,762 --> 00:43:54,298 BY SOME VERY CONSCIENTIOUS 1206 00:43:54,298 --> 00:43:58,102 AND VERY RIGOROUS, COMPUTATIONAL 1207 00:43:58,469 --> 00:44:00,872 BIOLOGY REVIEWERS OF OUR MANUSCRIPT, 1208 00:44:00,872 --> 00:44:02,106 WE PUT OUR CLASSIFIER 1209 00:44:02,106 --> 00:44:03,774 THROUGH ALL OF THE PACES, 1210 00:44:03,774 --> 00:44:04,675 AND, 1211 00:44:04,675 --> 00:44:05,710 AND DID 1212 00:44:05,710 --> 00:44:08,880 A LOT OF THE PERFORMANCE TESTING, 1213 00:44:08,880 --> 00:44:09,881 INCLUDING, 1214 00:44:09,881 --> 00:44:10,515 RUNNING 1215 00:44:10,515 --> 00:44:11,415 THESE ROC CURVES 1216 00:44:11,415 --> 00:44:13,618 AND MEASURING THE AREA UNDER THE CURVE, 1217 00:44:13,618 --> 00:44:16,654 WHICH, I HOPE YOU CAN SEE HERE, 1218 00:44:16,654 --> 00:44:18,523 SHOWS THAT OUR CLASSIFIER, 1219 00:44:18,523 --> 00:44:20,591 WHICH WE CALL SPOC, 1220 00:44:20,591 --> 00:44:21,859 PERFORMS, 1221 00:44:21,859 --> 00:44:22,994 REALLY QUITE WELL 1222 00:44:22,994 --> 00:44:24,629 AND MUCH BETTER THAN 1223 00:44:24,629 --> 00:44:26,964 THE PREVIOUS CONFIDENCE METRICS. 1224 00:44:26,964 --> 00:44:29,133 BUT I'LL SHOW YOU A MUCH MORE INTUITIVE 1225 00:44:29,133 --> 00:44:30,535 PERFORMANCE TEST IN A MOMENT. 
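For readers unfamiliar with the area under the ROC curve mentioned above: it equals the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counting one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are invented for illustration and are not results from the talk.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher;
    ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy held-out examples (invented values).
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation of positives from negatives.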
1226 00:44:30,535 --> 00:44:34,605 BUT I WILL POINT OUT THAT 1227 00:44:34,605 --> 00:44:35,573 WE CALL IT SPOC, 1228 00:44:35,573 --> 00:44:37,842 NOT ONLY BECAUSE IT'S A CUTE ACRONYM, 1229 00:44:37,842 --> 00:44:39,477 BUT BECAUSE REALLY 1230 00:44:39,477 --> 00:44:42,113 THE CLASSIFIER TAKES INTO CONSIDERATION 1231 00:44:42,113 --> 00:44:44,382 NOT ONLY THE STRUCTURAL FEATURES 1232 00:44:44,382 --> 00:44:45,616 OF THE PROTEIN PAIRS, 1233 00:44:45,616 --> 00:44:49,921 BUT ALSO A LONG LIST OF 1234 00:44:50,087 --> 00:44:53,424 OMICS-DERIVED BIOLOGICAL FEATURES OF EACH PAIR. 1235 00:44:53,424 --> 00:44:55,760 AND IT'S REALLY BY COMBINING 1236 00:44:55,760 --> 00:44:59,130 BOTH OF THESE PROPERTIES, 1237 00:44:59,130 --> 00:44:59,864 SO CONSIDERING 1238 00:44:59,864 --> 00:45:02,433 NOT ONLY THE STRUCTURAL PLAUSIBILITY 1239 00:45:02,433 --> 00:45:05,436 OF THE STRUCTURE PREDICTION, 1240 00:45:05,570 --> 00:45:08,239 BUT ALSO THE BIOLOGICAL PROPERTIES 1241 00:45:08,239 --> 00:45:09,907 OF THE PROTEINS INVOLVED, 1242 00:45:09,907 --> 00:45:12,176 THAT WE WERE ABLE TO GENERATE 1243 00:45:12,176 --> 00:45:16,013 A CLASSIFIER THAT PERFORMS REALLY WELL. 1244 00:45:16,013 --> 00:45:17,682 AND TO CONVINCE YOU THAT IT PERFORMS 1245 00:45:17,682 --> 00:45:19,383 WELL, I WILL REVISIT 1246 00:45:19,383 --> 00:45:20,885 THESE PROTEOME-WIDE SCREENS 1247 00:45:20,885 --> 00:45:23,788 AND SHOW YOU THAT UNLIKE THE 1248 00:45:23,788 --> 00:45:26,824 CLASSIC CONFIDENCE METRICS PDOCKQ AND IPTM, 1249 00:45:26,958 --> 00:45:28,659 WHICH REALLY SPREAD 1250 00:45:28,659 --> 00:45:30,861 THE FUNCTIONALLY RELEVANT PARTNERS 1251 00:45:30,861 --> 00:45:32,630 OVER A LARGE 1252 00:45:32,630 --> 00:45:35,633 LIST OF PRESUMABLY FALSE POSITIVES, 1253 00:45:35,833 --> 00:45:36,434 SPOC 1254 00:45:36,434 --> 00:45:38,502 REALLY DOES AN AMAZING JOB OF PUTTING 1255 00:45:38,502 --> 00:45:40,371 THE FUNCTIONALLY RELEVANT PARTNERS 1256 00:45:40,371 --> 00:45:41,806 AT THE VERY
TOP 1257 00:45:41,806 --> 00:45:44,809 OF THIS LIST OF 20,000 BINARY 1258 00:45:44,942 --> 00:45:45,610 PREDICTIONS. 1259 00:45:45,610 --> 00:45:48,579 WE SEE THE SAME THING FOR DONSON. 1260 00:45:48,579 --> 00:45:51,983 SO HERE THE FIVE OR SO FUNCTIONALLY 1261 00:45:51,983 --> 00:45:52,483 RELEVANT 1262 00:45:52,483 --> 00:45:53,851 PARTNERS ARE SPREAD OVER 1263 00:45:53,851 --> 00:45:54,552 THE TOP 1264 00:45:54,552 --> 00:45:57,455 SEVEN HITS IN THIS PROTEOME-WIDE SCREEN. 1265 00:45:57,455 --> 00:45:59,423 AND WE'VE ACTUALLY REPEATED 1266 00:45:59,423 --> 00:46:01,559 VERSIONS OF THESE PROTEOME- 1267 00:46:01,559 --> 00:46:04,595 WIDE SCREENS. WE'VE DONE THIS 1268 00:46:04,762 --> 00:46:08,399 ABOUT 30 TIMES, AND SPOC CONSISTENTLY 1269 00:46:08,699 --> 00:46:10,001 PERFORMS REALLY WELL, 1270 00:46:10,001 --> 00:46:11,836 INCLUDING ON PROTEINS 1271 00:46:11,836 --> 00:46:13,604 THAT ARE NOT IN THE DNA DAMAGE 1272 00:46:13,604 --> 00:46:16,941 RESPONSE BUT OPERATE IN THE CYTOPLASM 1273 00:46:17,141 --> 00:46:18,542 AND VARIOUS DIFFERENT ORGANELLES. 1274 00:46:18,542 --> 00:46:19,810 AND THAT ACTUALLY MAKES SENSE 1275 00:46:19,810 --> 00:46:21,679 BECAUSE NOTHING ABOUT SPOC'S 1276 00:46:21,679 --> 00:46:25,516 TRAINING WAS ACTUALLY REALLY PATHWAY 1277 00:46:25,516 --> 00:46:26,150 SPECIFIC. 1278 00:46:27,652 --> 00:46:28,185 OKAY. 1279 00:46:28,185 --> 00:46:31,789 SO NOW THAT WE HAVE A CLASSIFIER 1280 00:46:31,789 --> 00:46:34,291 THAT CAN ACTUALLY PICK OUT 1281 00:46:34,291 --> 00:46:37,061 FUNCTIONALLY RELEVANT 1282 00:46:37,061 --> 00:46:38,896 STRUCTURE PREDICTIONS 1283 00:46:38,896 --> 00:46:42,033 OUT OF A TRULY PROTEOME-WIDE SCREEN, 1284 00:46:42,833 --> 00:46:43,601 IN PRINCIPLE, 1285 00:46:43,601 --> 00:46:44,635 WE'RE NOW IN A POSITION 1286 00:46:44,635 --> 00:46:47,538 TO ACTUALLY TRY TO GENERATE 1287 00:46:47,538 --> 00:46:50,875 A COMPREHENSIVE STRUCTURAL INTERACTOME.
1288 00:46:51,242 --> 00:46:53,444 AND SO THAT WOULD REALLY INVOLVE 1289 00:46:53,444 --> 00:46:56,414 TAKING ALL 20,000 HUMAN PROTEINS 1290 00:46:56,580 --> 00:46:59,583 AND PAIRING THEM UP WITH EACH OTHER, 1291 00:46:59,850 --> 00:47:02,753 FOLDING EACH PAIR, 1292 00:47:02,753 --> 00:47:04,522 AND THEN 1293 00:47:04,522 --> 00:47:07,324 APPLYING THE SPOC CLASSIFIER. 1294 00:47:07,324 --> 00:47:08,659 THAT ACTUALLY WOULD INVOLVE 1295 00:47:08,659 --> 00:47:11,662 200 MILLION POSSIBLE PAIRS. 1296 00:47:12,463 --> 00:47:14,365 SOMETHING LIKE 0.1% OF 1297 00:47:14,365 --> 00:47:16,434 THESE ARE THOUGHT TO BE 1298 00:47:16,434 --> 00:47:18,602 ACTUALLY FUNCTIONAL. 1299 00:47:18,602 --> 00:47:20,104 SO IT'S REALLY LOOKING 1300 00:47:20,104 --> 00:47:21,372 FOR A NEEDLE IN A HAYSTACK, 1301 00:47:21,372 --> 00:47:24,675 WHICH WE THINK SPOC IS EQUIPPED TO DO. 1302 00:47:24,975 --> 00:47:27,144 HOWEVER, THERE IS OBVIOUSLY ONE 1303 00:47:27,144 --> 00:47:28,012 MAJOR BARRIER, 1304 00:47:28,012 --> 00:47:29,080 AND THAT IS THE COMPUTE 1305 00:47:29,080 --> 00:47:31,348 THAT WOULD BE NEEDED 1306 00:47:31,348 --> 00:47:34,785 TO COMPUTE THIS ENTIRE MATRIX. 1307 00:47:35,052 --> 00:47:36,253 SO THIS WOULD COST 1308 00:47:36,253 --> 00:47:37,788 TENS OF MILLIONS OF DOLLARS 1309 00:47:37,788 --> 00:47:40,791 IN CLOUD COMPUTING. 1310 00:47:40,891 --> 00:47:43,694 AND SO WHAT WE AND OTHERS 1311 00:47:43,694 --> 00:47:47,131 ARE DOING TO MAKE THIS SORT 1312 00:47:47,131 --> 00:47:47,865 OF AN APPROACHABLE 1313 00:47:47,865 --> 00:47:52,002 PROBLEM IS TO INTRODUCE A TRIAGE STEP. 1314 00:47:52,002 --> 00:47:53,437 AND I'M HAPPY TO DISCUSS 1315 00:47:53,437 --> 00:47:54,839 HOW WE'RE DOING THIS.
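The figures quoted here can be checked with a few lines of arithmetic. The sketch assumes unordered pairs of distinct proteins (homodimers excluded), which is an assumption of this example rather than something stated in the talk; the variable names are illustrative.

```python
import math

n_proteins = 20_000

# Unordered pairs of distinct proteins: n * (n - 1) / 2.
n_pairs = math.comb(n_proteins, 2)

# "Something like 0.1%" of pairs thought to be functional.
expected_functional = n_pairs // 1000

print(f"{n_pairs:,} possible pairs")
print(f"about {expected_functional:,} expected functional pairs")
```

Under that assumption the count comes to just under the 200 million quoted, with on the order of 200,000 functional pairs to find among them.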
1316 00:47:54,839 --> 00:47:57,842 BUT THE THE IDEA IS TO USE 1317 00:47:58,275 --> 00:47:59,977 MUCH LESS COMPUTATIONALLY 1318 00:47:59,977 --> 00:48:01,345 INTENSIVE APPROACHES 1319 00:48:01,345 --> 00:48:05,616 TO NOMINATE A SMALL SUBSET OF THE 200 1320 00:48:05,616 --> 00:48:07,017 MILLION PAIRS, 1321 00:48:07,017 --> 00:48:09,887 LET'S SAY THE TOP 1% OR 2 MILLION, 1322 00:48:09,887 --> 00:48:10,821 AND THEN SUBJECT 1323 00:48:10,821 --> 00:48:14,458 THOSE TO A FULL ANALYSIS WITH ALPHAFOLD 1324 00:48:14,725 --> 00:48:17,895 AND THE, THE SPOC CLASSIFIER. 1325 00:48:18,429 --> 00:48:20,131 AND WE'RE ACTUALLY GETTING, 1326 00:48:20,131 --> 00:48:21,699 VERY GENEROUS SUPPORT 1327 00:48:21,699 --> 00:48:24,702 FROM NVIDIA CORPORATION TO, 1328 00:48:24,702 --> 00:48:27,671 TO FOLD ABOUT 2 MILLION PAIRS, 1329 00:48:27,772 --> 00:48:31,275 WHICH WE HOPE WILL LEAD TO AT LEAST 1330 00:48:31,275 --> 00:48:33,344 A FIRST DRAFT 1331 00:48:33,344 --> 00:48:36,514 OF A REASONABLE STRUCTURAL INTERACTOME 1332 00:48:36,514 --> 00:48:37,715 THAT WE THINK, 1333 00:48:37,715 --> 00:48:39,784 TOGETHER WITH THE EFFORTS OF OTHER LABS 1334 00:48:39,784 --> 00:48:42,486 LIKE THE CONG LAB AT UT SOUTHWESTERN, 1335 00:48:42,486 --> 00:48:45,589 WILL, YOU KNOW, BEGIN TO GENERATE 1336 00:48:45,589 --> 00:48:48,192 A USEFUL STRUCTURAL INTERACTOME 1337 00:48:48,192 --> 00:48:49,126 THAT CAN CATALYZE 1338 00:48:49,126 --> 00:48:50,628 A LOT OF MECHANISTIC 1339 00:48:50,628 --> 00:48:51,996 DISCOVERY IN THE FIELD. 1340 00:48:53,030 --> 00:48:55,866 SO, UNTIL 1341 00:48:55,866 --> 00:48:58,869 WE HAVE THAT AND AND THAT'S, YOU KNOW, 1342 00:48:59,370 --> 00:49:01,572 GOING TO BE A LITTLE BIT OF, 1343 00:49:01,572 --> 00:49:03,140 IT'S GOING TO TAKE SOME TIME TO, 1344 00:49:03,140 --> 00:49:04,475 TO COMPLETE THE FOLDING. 1345 00:49:04,475 --> 00:49:06,577 WE'RE ABOUT IN THE MIDDLE OF THE RUN 1346 00:49:06,577 --> 00:49:07,511 RIGHT NOW. 
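The triage idea (score every pair with something cheap, keep only the top slice, and fold just that slice) can be sketched as below. The talk does not say what the inexpensive scoring method is, so `cheap_score` is a hypothetical placeholder, and the protein names, scores, and `keep_percent` value are invented for the example.

```python
import heapq
import math
from itertools import combinations


def triage(proteins, cheap_score, keep_percent=1):
    """Keep the top keep_percent of all unordered protein pairs,
    ranked by an inexpensive score, for full structure prediction."""
    total = math.comb(len(proteins), 2)
    keep = max(1, total * keep_percent // 100)
    # nlargest streams over the pairs, so the full list is never stored.
    return heapq.nlargest(
        keep, combinations(proteins, 2), key=lambda pair: cheap_score(*pair)
    )


# Hypothetical cheap scores; pairs not listed default to 0.0.
toy_scores = {("A", "B"): 0.9, ("C", "E"): 0.7, ("B", "D"): 0.2}


def cheap_score(a, b):
    return toy_scores.get((a, b), 0.0)


shortlist = triage(["A", "B", "C", "D", "E"], cheap_score, keep_percent=20)
print(shortlist)
```

With 20,000 proteins and `keep_percent=1`, the shortlist would hold roughly 2 million pairs, matching the scale of the run described above.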
1347 00:49:07,511 --> 00:49:09,413 AND IT'S GOING TO REQUIRE 1348 00:49:09,413 --> 00:49:10,781 QUITE A BIT OF ANALYSIS. 1349 00:49:10,781 --> 00:49:11,582 BUT IN THE MEANTIME, 1350 00:49:11,582 --> 00:49:13,150 THERE ARE ACTUALLY RESOURCES 1351 00:49:13,150 --> 00:49:16,987 THAT YOU CAN UTILIZE ALREADY. 1352 00:49:16,987 --> 00:49:20,191 SO THE PREDICTOMES.ORG RESOURCE, 1353 00:49:20,191 --> 00:49:22,593 WHICH IS REALLY DEDICATED TO 1354 00:49:22,593 --> 00:49:24,995 THE GENOME MAINTENANCE PATHWAY. 1355 00:49:24,995 --> 00:49:26,463 AND I'D ALSO LIKE TO POINT OUT 1356 00:49:26,463 --> 00:49:27,998 THAT THE CONG LAB AT UT 1357 00:49:27,998 --> 00:49:32,102 SOUTHWESTERN HAS ACTUALLY ALREADY 1358 00:49:32,102 --> 00:49:33,504 TAKEN A FIRST SWING 1359 00:49:33,504 --> 00:49:37,541 AT A STRUCTURAL INTERACTOME AND 1360 00:49:37,541 --> 00:49:38,843 PUBLISHED A BIORXIV PAPER, 1361 00:49:38,843 --> 00:49:41,078 AND THEY HAVE THIS VERY NICE WEB RESOURCE 1362 00:49:41,078 --> 00:49:42,112 WHERE YOU CAN TYPE 1363 00:49:42,112 --> 00:49:43,714 IN YOUR FAVORITE PROTEIN, 1364 00:49:43,714 --> 00:49:47,284 AND THEY HAVE NOMINATED ABOUT 8000 NEW, 1365 00:49:47,585 --> 00:49:49,086 VERY HIGH QUALITY, HIGH 1366 00:49:49,086 --> 00:49:51,155 CONFIDENCE PROTEIN-PROTEIN INTERACTIONS. 1367 00:49:51,155 --> 00:49:52,389 WE'VE LOOKED AT THE DATABASE. 1368 00:49:52,389 --> 00:49:54,892 IT'S EXCELLENT, 1369 00:49:54,892 --> 00:49:58,195 BUT STILL, OF COURSE, VERY INCOMPLETE. 1370 00:49:58,195 --> 00:50:00,164 BUT THIS IS DEFINITELY A STARTING POINT 1371 00:50:00,164 --> 00:50:02,867 FOR ALL OF YOU TO START LOOKING FOR 1372 00:50:02,867 --> 00:50:04,001 FUNCTIONALLY RELEVANT 1373 00:50:04,001 --> 00:50:06,103 INTERACTORS OF YOUR FAVORITE PROTEINS. 1374 00:50:06,103 --> 00:50:06,570 OF COURSE, 1375 00:50:06,570 --> 00:50:07,571 YOU CAN DO 1376 00:50:07,571 --> 00:50:10,674 YOUR OWN PROTEIN FOLDING EXPERIMENTS.
1377 00:50:10,941 --> 00:50:13,644 ESPECIALLY NOW THAT DEEPMIND 1378 00:50:13,644 --> 00:50:14,912 HAS MADE AVAILABLE, 1379 00:50:14,912 --> 00:50:17,214 THIS VERY USER FRIENDLY, 1380 00:50:17,214 --> 00:50:20,251 WEB INTERFACE WHERE YOU CAN, 1381 00:50:20,651 --> 00:50:22,820 USE THE ALPHAFOLD ALGORITHM 1382 00:50:22,820 --> 00:50:24,255 TO LOOK NOT ONLY FOR PROTEIN 1383 00:50:24,255 --> 00:50:25,222 PROTEIN INTERACTIONS, 1384 00:50:25,222 --> 00:50:26,891 BUT ALSO PROTEIN NUCLEIC ACID 1385 00:50:26,891 --> 00:50:28,826 AND PROTEIN LIGAND INTERACTIONS. 1386 00:50:28,826 --> 00:50:30,327 AND IF YOU HAVEN'T USED THIS ALREADY, 1387 00:50:30,327 --> 00:50:32,029 I STRONGLY ENCOURAGE YOU. 1388 00:50:32,029 --> 00:50:32,863 IT'S A BIT 1389 00:50:32,863 --> 00:50:36,267 LIMITED IN HOW MANY, EXPERIMENTS 1390 00:50:36,267 --> 00:50:37,468 YOU CAN DO, BUT YOU CAN 1391 00:50:37,468 --> 00:50:38,569 ACTUALLY, YOU KNOW, 1392 00:50:38,569 --> 00:50:39,003 IF YOU 1393 00:50:39,003 --> 00:50:39,570 IF YOU COLLECT 1394 00:50:39,570 --> 00:50:41,672 A COUPLE OF GOOGLE, ACCOUNTS 1395 00:50:41,672 --> 00:50:42,973 AND DO IT OVER A FEW DAYS, 1396 00:50:42,973 --> 00:50:44,541 YOU CAN EASILY FOLD A FEW HUNDRED 1397 00:50:44,541 --> 00:50:46,043 PAIRS AND DO A MINI SCREEN. 
1398 00:50:47,144 --> 00:50:49,079 AND, YOU KNOW, IF THE WHOLE, 1399 00:50:49,079 --> 00:50:50,581 ALPHAFOLD BUSINESS 1400 00:50:50,581 --> 00:50:52,583 IS INTIMIDATING FOR YOU, 1401 00:50:52,583 --> 00:50:53,584 AND IF YOU'RE NOT SURE 1402 00:50:53,584 --> 00:50:55,152 ABOUT THE CONFIDENCE METRICS 1403 00:50:55,152 --> 00:50:57,254 AND SO FORTH, THE, 1404 00:50:57,254 --> 00:50:58,923 THE EUROPEAN BIOINFORMATICS 1405 00:50:58,923 --> 00:51:00,691 INSTITUTE HAS SOME VERY NICE 1406 00:51:00,691 --> 00:51:02,326 RESOURCES, INCLUDING, 1407 00:51:02,326 --> 00:51:04,361 A REALLY GREAT TUTORIAL 1408 00:51:04,361 --> 00:51:07,364 THAT THAT REALLY, GIVES, 1409 00:51:07,564 --> 00:51:09,366 A VERY NICE INTRODUCTION TO, 1410 00:51:09,366 --> 00:51:10,234 TO ALPHAFOLD 1411 00:51:10,234 --> 00:51:13,137 AND THE CONFIDENCE METRICS AND SO FORTH. 1412 00:51:13,137 --> 00:51:16,140 SO I'D LIKE TO END BY 1413 00:51:16,440 --> 00:51:19,209 REALLY JUST POINTING OUT THAT I THINK 1414 00:51:19,209 --> 00:51:21,111 WITH THE STRUCTURE PREDICTION REVOLUTION, 1415 00:51:21,111 --> 00:51:23,013 THERE IS A REALLY DRAMATIC 1416 00:51:23,013 --> 00:51:25,883 DEMOCRATIZATION OF STRUCTURAL BIOLOGY 1417 00:51:25,883 --> 00:51:27,618 THAT IS UNDERWAY. 
1418 00:51:27,618 --> 00:51:29,420 I THINK UNTIL RECENTLY, 1419 00:51:29,420 --> 00:51:30,721 THERE WAS SORT OF THIS, 1420 00:51:30,721 --> 00:51:33,157 YOU KNOW, THIS FIREWALL BETWEEN, 1421 00:51:33,157 --> 00:51:34,892 THE PEOPLE WHO, 1422 00:51:34,892 --> 00:51:36,794 YOU KNOW, DO GENETICS AND CELL BIOLOGY 1423 00:51:36,794 --> 00:51:37,795 AND SOME BIOCHEMISTRY 1424 00:51:37,795 --> 00:51:41,432 AND THOSE WHO DO STRUCTURAL BIOLOGY AND 1425 00:51:41,432 --> 00:51:44,702 I THINK THAT WITH THE, 1426 00:51:45,202 --> 00:51:45,536 YOU KNOW, 1427 00:51:45,536 --> 00:51:46,804 THE ADVENT OF ALPHAFOLD 1428 00:51:46,804 --> 00:51:47,871 AND OTHER RELATED 1429 00:51:47,871 --> 00:51:49,573 ALGORITHMS, ALL OF US, 1430 00:51:49,573 --> 00:51:50,541 WHETHER WE'RE 1431 00:51:50,541 --> 00:51:54,478 GENETICISTS, CELL BIOLOGISTS, BIOCHEMISTS 1432 00:51:54,478 --> 00:51:55,813 OR SOMETHING ELSE, 1433 00:51:55,813 --> 00:51:57,781 WE CAN ALL NOW BEGIN 1434 00:51:57,781 --> 00:51:58,782 TO REALLY THINK 1435 00:51:58,782 --> 00:52:01,785 STRUCTURALLY ABOUT ALL OF THE PROBLEMS 1436 00:52:01,952 --> 00:52:03,420 THAT WE WORK ON. 
1437 00:52:03,420 --> 00:52:04,555 AND, 1438 00:52:04,555 --> 00:52:05,222 SO I THINK, 1439 00:52:05,222 --> 00:52:06,323 I THINK THIS, THIS WALL 1440 00:52:06,323 --> 00:52:08,726 IS COMPLETELY BREAKING DOWN 1441 00:52:08,726 --> 00:52:11,495 AND, YOU KNOW, FOR MY LAB 1442 00:52:11,495 --> 00:52:12,830 AND I THINK FOR A LOT OF OTHER LABS 1443 00:52:12,830 --> 00:52:14,598 NOW, A REALLY TYPICAL 1444 00:52:14,598 --> 00:52:17,601 SORT OF WORKFLOW IS TO EITHER, 1445 00:52:17,868 --> 00:52:19,703 YOU KNOW, USE THE RESOURCES 1446 00:52:19,703 --> 00:52:21,805 THAT I MENTIONED A FEW SLIDES AGO, 1447 00:52:21,805 --> 00:52:25,175 OR ONE'S OWN STRUCTURE, PREDICTIONS TO, 1448 00:52:25,542 --> 00:52:28,612 TO IDENTIFY POTENTIAL NEW INTERACTIONS, 1449 00:52:28,946 --> 00:52:30,314 WHICH IN MANY CASES 1450 00:52:30,314 --> 00:52:32,416 LEAD TO A COMPELLING HYPOTHESIS 1451 00:52:32,416 --> 00:52:34,952 THAT IS IMMEDIATELY TESTABLE, 1452 00:52:34,952 --> 00:52:35,686 YOU KNOW, THROUGH SITE 1453 00:52:35,686 --> 00:52:37,521 DIRECTED MUTAGENESIS, 1454 00:52:37,521 --> 00:52:39,923 AND STRUCTURE FUNCTION ANALYSIS 1455 00:52:39,923 --> 00:52:40,624 WITH, OF COURSE, 1456 00:52:40,624 --> 00:52:43,627 THE ULTIMATE GOAL ALWAYS BEING TO CONFIRM 1457 00:52:43,627 --> 00:52:44,862 THESE PREDICTIONS 1458 00:52:44,862 --> 00:52:46,697 USING STRUCTURE DETERMINATION. 
1459 00:52:46,697 --> 00:52:48,432 BUT WHAT'S WHAT'S SO POWERFUL 1460 00:52:48,432 --> 00:52:49,166 AND SO BEAUTIFUL 1461 00:52:49,166 --> 00:52:50,000 ABOUT THE STRUCTURE 1462 00:52:50,000 --> 00:52:51,969 PREDICTION, REVOLUTION 1463 00:52:51,969 --> 00:52:56,273 IS THAT WE CAN ALL, THINK STRUCTURALLY 1464 00:52:56,273 --> 00:53:00,110 AND WE CAN LONG BEFORE WE HAVE 1465 00:53:00,244 --> 00:53:01,578 THE REAGENTS 1466 00:53:01,578 --> 00:53:02,413 OR THE RESULTS 1467 00:53:02,413 --> 00:53:03,180 OF AN EXPERIMENTAL 1468 00:53:03,180 --> 00:53:04,481 STRUCTURE DETERMINATION, 1469 00:53:04,481 --> 00:53:07,384 WE CAN ACTUALLY, START, 1470 00:53:07,384 --> 00:53:09,686 FORMULATING AND TESTING HYPOTHESES. 1471 00:53:09,686 --> 00:53:11,155 AND, AND I ALSO WANT TO SAY 1472 00:53:11,155 --> 00:53:12,089 THAT, YOU KNOW, FAR 1473 00:53:12,089 --> 00:53:13,357 FROM SOMEHOW 1474 00:53:13,357 --> 00:53:15,092 PUTTING STRUCTURAL BIOLOGISTS 1475 00:53:15,092 --> 00:53:15,926 OUT OF BUSINESS, 1476 00:53:15,926 --> 00:53:16,894 I THINK IT'S IN, 1477 00:53:16,894 --> 00:53:17,561 IN MANY WAYS 1478 00:53:17,561 --> 00:53:19,897 HAVING THE OPPOSITE EFFECT BECAUSE, 1479 00:53:19,897 --> 00:53:21,999 SO MANY MORE GROUPS CAN NOW, 1480 00:53:22,966 --> 00:53:24,968 FORMULATE 1481 00:53:24,968 --> 00:53:27,404 STRUCTURAL HYPOTHESES, YOU KNOW, 1482 00:53:27,404 --> 00:53:31,875 FEASIBLE STRUCTURAL HYPOTHESES THAT THAT, 1483 00:53:32,309 --> 00:53:32,976 WE'RE NOW 1484 00:53:32,976 --> 00:53:34,845 TALKING TO THE STRUCTURAL BIOLOGISTS 1485 00:53:34,845 --> 00:53:37,681 ABOUT AND GETTING THEIR HELP TESTING. 1486 00:53:37,681 --> 00:53:39,183 SO, SO FOR ME, 1487 00:53:39,183 --> 00:53:39,450 YOU KNOW, 1488 00:53:39,450 --> 00:53:41,485 IT IS REALLY DRAMATICALLY STRENGTHENED 1489 00:53:41,485 --> 00:53:43,020 MY INTERACTION 1490 00:53:43,020 --> 00:53:45,089 WITH WITH LOCAL STRUCTURE 1491 00:53:45,089 --> 00:53:47,491 DETERMINATION GROUPS. 
1492 00:53:47,491 --> 00:53:48,492 I DON'T WANT TO LEAVE YOU 1493 00:53:48,492 --> 00:53:50,260 WITH TOO ROSY OF A PICTURE. 1494 00:53:50,260 --> 00:53:53,564 I WILL POINT OUT THAT ALPHAFOLD 1495 00:53:53,564 --> 00:53:54,765 IS NOT INFALLIBLE. 1496 00:53:54,765 --> 00:53:55,866 THERE ARE LOTS OF FALSE 1497 00:53:55,866 --> 00:53:57,367 NEGATIVE INTERACTIONS 1498 00:53:57,367 --> 00:53:59,403 AND STILL SOME FALSE POSITIVES. 1499 00:53:59,403 --> 00:54:02,606 THERE ARE ERRORS IN RESIDUE 1500 00:54:02,606 --> 00:54:03,774 AND DOMAIN POSITIONING 1501 00:54:03,774 --> 00:54:06,810 AND, OF COURSE, A LIMITED SORT OF 1502 00:54:07,044 --> 00:54:10,347 SURVEY OF THE CONFORMATIONAL DYNAMICS. 1503 00:54:10,347 --> 00:54:12,249 SO ALL OF THIS MEANS THAT, 1504 00:54:12,249 --> 00:54:13,150 AND TOGETHER WITH 1505 00:54:13,150 --> 00:54:13,984 WHAT I SAID 1506 00:54:13,984 --> 00:54:14,852 JUST A MOMENT AGO, 1507 00:54:14,852 --> 00:54:15,853 ALL OF THIS MEANS 1508 00:54:15,853 --> 00:54:17,454 THAT EXPERIMENTAL STRUCTURE 1509 00:54:17,454 --> 00:54:19,256 DETERMINATION IS STILL REALLY 1510 00:54:19,256 --> 00:54:21,658 AN ABSOLUTELY ESSENTIAL 1511 00:54:21,658 --> 00:54:25,229 PART OF DERIVING 1512 00:54:25,395 --> 00:54:26,864 A DEEP MECHANISTIC 1513 00:54:26,864 --> 00:54:28,332 UNDERSTANDING OF ALL THE PROCESSES 1514 00:54:28,332 --> 00:54:29,333 THAT WE CARE ABOUT. 1515 00:54:29,333 --> 00:54:30,400 SO I'M GOING TO STOP THERE 1516 00:54:30,400 --> 00:54:33,103 AND THANK THE PEOPLE WHO DID THE WORK. 1517 00:54:33,103 --> 00:54:34,404 SO TYCHO, 1518 00:54:34,404 --> 00:54:36,240 A REALLY FABULOUS POSTDOC 1519 00:54:36,240 --> 00:54:37,875 WHO IS ON THE JOB MARKET,
1520 00:54:37,875 --> 00:54:43,580 DID ALL THE EXPERIMENTAL WORK ON TC-NER. 1521 00:54:43,580 --> 00:54:46,583 ERNST REALLY LED THE WAY ON THE 1522 00:54:46,817 --> 00:54:48,218 STRUCTURE PREDICTION 1523 00:54:48,218 --> 00:54:50,154 SIDE WITH HELP FROM HELEN 1524 00:54:50,154 --> 00:54:51,955 AND JACK, TWO UNDERGRADUATES. 1525 00:54:51,955 --> 00:54:53,857 AND I WANT TO THANK MY AMAZING 1526 00:54:53,857 --> 00:54:56,093 COLLABORATOR LUCAS AND HIS TRAINEES 1527 00:54:57,194 --> 00:54:58,896 FOR JUST BLOWING 1528 00:54:58,896 --> 00:55:00,931 OUR MINDS WITH HOW QUICKLY 1529 00:55:00,931 --> 00:55:03,867 THEY CAN ACTUALLY GENERATE 1530 00:55:03,867 --> 00:55:06,870 EXPERIMENTAL STRUCTURE 1531 00:55:07,237 --> 00:55:10,274 DETERMINATIONS TO HELP US TEST 1532 00:55:10,440 --> 00:55:11,909 WHETHER WE'RE ON THE RIGHT TRACK 1533 00:55:11,909 --> 00:55:14,878 WITH OUR PREDICTIONS. 1534 00:55:14,878 --> 00:55:17,281 AND I THINK I'M GOING TO STOP THERE AND 1535 00:55:17,281 --> 00:55:18,916 JUST THANK MY FUNDING SOURCES, 1536 00:55:18,916 --> 00:55:19,349 OF COURSE, 1537 00:55:19,349 --> 00:55:21,251 AND GIVE ANOTHER SHOUT OUT TO NVIDIA 1538 00:55:21,251 --> 00:55:25,556 FOR GENEROUSLY DONATING GPU TIME TO 1539 00:55:26,490 --> 00:55:28,959 START ZIPPING THROUGH 1540 00:55:28,959 --> 00:55:31,128 THESE MANY PREDICTIONS THAT WE WANT 1541 00:55:31,128 --> 00:55:32,563 TO GENERATE 1542 00:55:32,563 --> 00:55:35,566 TO WORK TOWARDS A STRUCTURAL INTERACTOME. 1543 00:55:36,967 --> 00:55:37,701 ALL RIGHT. 1544 00:55:37,701 --> 00:55:39,770 SO, I'LL STOP THERE. 1545 00:55:39,770 --> 00:55:42,773 I GUESS I'LL STOP SHARING. 1546 00:55:44,808 --> 00:55:45,909 THANKS SO MUCH, JOHANNES. 1547 00:55:45,909 --> 00:55:48,912 THAT WAS WONDERFUL. 1548 00:55:49,279 --> 00:55:49,846 JIM. THANK YOU. 1549 00:55:49,846 --> 00:55:51,748 CHUNZHANG, DO YOU WANT TO START WITH THE QUESTIONS? 1550 00:55:51,748 --> 00:55:52,482 YEAH, SURE.
1551 00:55:52,482 --> 00:55:54,284 SO I'LL GO THROUGH THE 1552 00:55:54,284 --> 00:55:56,353 QUESTIONS IN THE CHAT BOX. 1553 00:55:56,353 --> 00:55:59,523 SO WE HAVE A QUESTION FROM MARCELLA 1554 00:55:59,590 --> 00:56:00,757 SAGAR. 1555 00:56:00,757 --> 00:56:01,725 WHAT IS THE 1556 00:56:01,725 --> 00:56:03,460 ADVANTAGE OF USING 1557 00:56:03,460 --> 00:56:04,194 ALPHAFOLD-MULTIMER 1558 00:56:04,194 --> 00:56:06,563 TO IDENTIFY 1559 00:56:06,563 --> 00:56:08,298 PROTEIN-PROTEIN INTERACTIONS? 1560 00:56:08,298 --> 00:56:09,299 ALPHAFOLD-MULTIMER 1561 00:56:09,299 --> 00:56:12,102 ASSUMES THE PROTEIN IS UNFOLDED 1562 00:56:12,102 --> 00:56:13,637 WITHOUT ITS PROTEIN 1563 00:56:13,637 --> 00:56:14,471 BINDING PARTNER 1564 00:56:14,471 --> 00:56:16,840 AND FOLDING INVOLVES INTERACTION 1565 00:56:16,840 --> 00:56:18,408 WITH THE BAIT PROTEIN. 1566 00:56:18,408 --> 00:56:19,810 COULD YOU IDENTIFY A NOVEL 1567 00:56:19,810 --> 00:56:21,078 BINDING PARTNER ASSUMING 1568 00:56:21,078 --> 00:56:22,913 THE PROTEIN IS ALREADY FOLDED 1569 00:56:22,913 --> 00:56:24,748 AND THE OTHER FOLDING PROTEIN 1570 00:56:24,748 --> 00:56:25,782 WITH THE BAIT PROTEIN? 1571 00:56:28,218 --> 00:56:30,220 I'M NOT SURE I 1572 00:56:30,220 --> 00:56:33,457 ENTIRELY UNDERSTAND THE QUESTION. 1573 00:56:33,924 --> 00:56:36,860 I GUESS 1574 00:56:36,860 --> 00:56:40,430 I'LL SAY THAT IN MOST CASES, 1575 00:56:41,064 --> 00:56:43,700 YOU KNOW, BOTH THE BAIT 1576 00:56:43,700 --> 00:56:47,871 AND THE PREY PROTEINS WILL FOLD, 1577 00:56:48,605 --> 00:56:50,774 YOU KNOW, BOTH THERMODYNAMICALLY 1578 00:56:50,774 --> 00:56:54,878 AS WELL AS WITH ALPHAFOLD, ON THEIR OWN. 1579 00:56:56,446 --> 00:56:59,449 THERE ARE INSTANCES. 1580 00:56:59,816 --> 00:57:02,052 SO THERE ARE INTRINSICALLY DISORDERED 1581 00:57:02,052 --> 00:57:03,920 REGIONS, AS EVERYONE KNOWS.
1582 00:57:03,920 --> 00:57:05,856 AND A LOT OF THESE CONTAIN 1583 00:57:05,856 --> 00:57:09,026 SHORT LINEAR MOTIFS, SHORT PEPTIDES 1584 00:57:09,192 --> 00:57:10,861 THAT ARE USUALLY HIGHLY CONSERVED 1585 00:57:10,861 --> 00:57:12,763 BUT EMBEDDED WITHIN OTHERWISE 1586 00:57:12,763 --> 00:57:14,197 DISORDERED REGIONS. 1587 00:57:14,197 --> 00:57:17,601 AND WE OFTEN SEE 1588 00:57:17,701 --> 00:57:19,603 THAT THESE ARE 1589 00:57:19,603 --> 00:57:21,004 UNFOLDED ON THEIR OWN, 1590 00:57:21,004 --> 00:57:21,972 BUT WHEN WE FOLD THEM 1591 00:57:21,972 --> 00:57:23,473 WITH A FUNCTIONALLY RELEVANT PARTNER, 1592 00:57:23,473 --> 00:57:27,010 THEY NOW ADOPT A HELICAL FOLD. 1593 00:57:27,644 --> 00:57:28,945 SO THAT IS, 1594 00:57:28,945 --> 00:57:29,346 I THINK, 1595 00:57:29,346 --> 00:57:30,814 PROBABLY THE MOST PROMINENT 1596 00:57:30,814 --> 00:57:34,518 SORT OF SITUATION IN WHICH, YOU KNOW, 1597 00:57:34,518 --> 00:57:35,319 THE 1598 00:57:35,319 --> 00:57:36,586 ACT OF ACTUALLY 1599 00:57:36,586 --> 00:57:39,289 BRINGING TWO PARTNERS TOGETHER 1600 00:57:39,289 --> 00:57:40,457 INDUCES FOLDING, 1601 00:57:40,457 --> 00:57:41,591 BUT OTHERWISE, 1602 00:57:41,591 --> 00:57:44,594 THAT IS TYPICALLY NOT THE CASE. 1603 00:57:44,594 --> 00:57:46,663 BUT I'M NOT SURE I ENTIRELY 1604 00:57:46,663 --> 00:57:48,098 UNDERSTOOD THE QUESTION, BUT 1605 00:57:48,098 --> 00:57:49,533 HOPEFULLY I GOT AT LEAST PART OF IT. 1606 00:57:53,203 --> 00:57:54,504 I'LL ASK THE NEXT ONE. 1607 00:57:54,504 --> 00:57:57,641 WEI YANG (HI WEI) ASKS: 1608 00:57:58,842 --> 00:57:59,142 1609 00:57:59,142 --> 00:58:01,478 STK19 IS PREDICTED TO INTERACT WITH MSH2 1610 00:58:01,478 --> 00:58:04,481 JUST BELOW CSA AND XPD. 1611 00:58:04,715 --> 00:58:05,582 WHAT IS KNOWN OF 1612 00:58:05,582 --> 00:58:08,585 STK19 AND MISMATCH REPAIR? 1613 00:58:09,152 --> 00:58:11,655 NOTHING TO MY KNOWLEDGE.
1614 00:58:11,655 --> 00:58:13,824 AND, 1615 00:58:13,824 --> 00:58:16,927 I DON'T REMEMBER THE EXACT SPOC SCORE, 1616 00:58:16,927 --> 00:58:17,494 BUT I THINK 1617 00:58:17,494 --> 00:58:17,994 I THINK THERE WAS 1618 00:58:17,994 --> 00:58:20,397 A BIT OF A DROP OFF IN TERMS OF, 1619 00:58:20,397 --> 00:58:21,832 YOU KNOW, THE, 1620 00:58:21,832 --> 00:58:25,302 THE ACTUAL CONFIDENCE VALUE. 1621 00:58:27,371 --> 00:58:28,872 BUT, YOU KNOW, 1622 00:58:28,872 --> 00:58:29,706 IT'S POSSIBLE, 1623 00:58:29,706 --> 00:58:32,042 I DON'T THINK ANYBODY HAS LOOKED AT 1624 00:58:32,042 --> 00:58:35,112 WHETHER THERE IS A FUNCTIONAL INTERPLAY 1625 00:58:35,112 --> 00:58:38,815 OF STK19 WITH MISMATCH REPAIR. 1626 00:58:39,383 --> 00:58:42,386 AND THIS IS, YOU KNOW, THIS IS WHERE 1627 00:58:43,186 --> 00:58:46,423 THERE IS OBVIOUSLY OFTEN A GRAY ZONE, 1628 00:58:46,923 --> 00:58:48,158 WITH THIS STRUCTURES 1629 00:58:48,158 --> 00:58:49,926 PREDICTION BUSINESS, THE 1630 00:58:49,926 --> 00:58:51,661 I WOULD SAY THE MOST, 1631 00:58:51,661 --> 00:58:55,232 YOU KNOW, COMPELLING CASES ARE 1632 00:58:55,232 --> 00:58:57,901 WHEN YOU GET A PREDICTION 1633 00:58:57,901 --> 00:58:59,136 THAT REALLY GIVES YOU 1634 00:58:59,136 --> 00:59:00,604 A EUREKA MOMENT, RIGHT? 1635 00:59:00,604 --> 00:59:03,306 AS IN THE CASE OF STK19 1636 00:59:03,306 --> 00:59:06,676 INTERACTING WITH CSA AND XPD, 1637 00:59:07,077 --> 00:59:08,378 TWO FACTORS 1638 00:59:08,378 --> 00:59:10,514 ALREADY KNOWN TO BE IN THE SAME PATHWAY. 1639 00:59:10,514 --> 00:59:12,582 THAT'S THAT'S REALLY KIND OF A NO BRAINER 1640 00:59:12,582 --> 00:59:13,116 WHERE YOU 1641 00:59:13,116 --> 00:59:14,084 YOU LOOK AT THAT PREDICTION, 1642 00:59:14,084 --> 00:59:14,951 YOU SAY, 1643 00:59:14,951 --> 00:59:17,220 YOU KNOW THAT THAT IS 1644 00:59:17,220 --> 00:59:18,722 THAT MAKES SO MUCH SENSE 1645 00:59:18,722 --> 00:59:20,957 THAT I'M GOING TO SPEND 1646 00:59:20,957 --> 00:59:23,427 THE TIME IT TAKES TO TEST THIS IDEA. 
1647 00:59:24,961 --> 00:59:27,230 IN OTHER INSTANCES, IT CAN BE 1648 00:59:27,230 --> 00:59:30,400 OBVIOUSLY A MUCH RISKIER PROPOSITION. 1649 00:59:30,834 --> 00:59:32,602 AND SO IN THE CASE OF STK 1650 00:59:32,602 --> 00:59:35,605 19 AND MISMATCH REPAIR, YOU KNOW, I CAN'T 1651 00:59:35,772 --> 00:59:37,441 I REALLY CAN'T COMMENT, 1652 00:59:37,441 --> 00:59:40,143 ON WHETHER THAT'S WORTH PURSUING OR NOT 1653 00:59:40,143 --> 00:59:41,478 OR WHETHER THERE'S ANY, 1654 00:59:41,478 --> 00:59:44,080 ANY PRIOR EVIDENCE. 1655 00:59:44,080 --> 00:59:47,083 WE CERTAINLY DON'T HAVE ANY. 1656 00:59:47,918 --> 00:59:48,251 OKAY. 1657 00:59:48,251 --> 00:59:51,221 THE NEXT QUESTION IS FROM DMITRI GORDENIN. 1658 00:59:51,655 --> 00:59:52,389 WOULD YOU, 1659 00:59:52,389 --> 00:59:53,890 WOULD YOUR PLASMID SYSTEM 1660 00:59:53,890 --> 00:59:55,525 BE APPLICABLE TO NON-BULKY 1661 00:59:55,525 --> 00:59:56,693 BASED LESIONS 1662 00:59:56,693 --> 00:59:59,696 IMPEDING RNA POLYMERASE? 1663 01:00:01,097 --> 01:00:04,100 RIGHT. 1664 01:00:04,901 --> 01:00:07,871 I THINK I THINK IN PRINCIPLE, YES. 1665 01:00:08,071 --> 01:00:08,839 AND IT'S 1666 01:00:08,839 --> 01:00:12,342 ACTUALLY BRINGS UP AN INTERESTING POINT, 1667 01:00:13,276 --> 01:00:15,745 WHICH IS THAT THERE ARE PRESUMABLY 1668 01:00:15,745 --> 01:00:16,246 THERE ARE 1669 01:00:16,246 --> 01:00:16,847 AND I'M SURE 1670 01:00:16,847 --> 01:00:17,814 THERE ARE PROBABLY OTHER PEOPLE 1671 01:00:17,814 --> 01:00:18,281 ON THE CALL 1672 01:00:18,281 --> 01:00:19,916 WHO KNOW MORE ABOUT THIS THAN I DO, 1673 01:00:19,916 --> 01:00:23,019 BUT THERE ARE PROBABLY LESIONS, SUCH 1674 01:00:23,019 --> 01:00:26,022 AS ABASIC SITES THAT WOULDN'T TRIGGER 1675 01:00:26,256 --> 01:00:28,859 THE GLOBAL GENOMIC REPAIR PATHWAY, 1676 01:00:28,859 --> 01:00:31,728 BUT THAT WOULD ABSOLUTELY STALL 1677 01:00:31,728 --> 01:00:32,929 RNA POLYMERASE. 
1678 01:00:34,130 --> 01:00:35,632 BUT THEN I GUESS 1679 01:00:35,632 --> 01:00:36,266 THE ISSUE IS 1680 01:00:36,266 --> 01:00:39,269 THAT THEY MAY NOT STALL THE XPD HELICASE. 1681 01:00:39,436 --> 01:00:43,707 SO, YEAH, IT'S AN INTERESTING QUESTION. 1682 01:00:43,707 --> 01:00:45,675 WE'VE NOT DONE ANYTHING BEYOND 1683 01:00:45,675 --> 01:00:48,812 USING THE CISPLATIN 1,3-GTG 1684 01:00:49,379 --> 01:00:51,481 INTRASTRAND CROSS-LINK. 1685 01:00:51,481 --> 01:00:52,482 BUT CERTAINLY 1686 01:00:52,482 --> 01:00:55,585 THE SYSTEM LENDS ITSELF TO EXPLORING 1687 01:00:56,520 --> 01:01:00,724 MORE WIDELY WHAT RANGE OF LESIONS 1688 01:01:00,724 --> 01:01:03,760 WOULD ACTUALLY LEAD TO PRODUCTIVE 1689 01:01:03,760 --> 01:01:06,763 TC-NER. 1690 01:01:07,631 --> 01:01:07,898 OKAY. 1691 01:01:07,898 --> 01:01:09,432 DMITRI HAS A FOLLOW-UP QUESTION 1692 01:01:09,432 --> 01:01:11,501 ON A COMPLETELY DIFFERENT TOPIC: 1693 01:01:11,501 --> 01:01:12,702 HOW MUCH ALPHAFOLD-BASED 1694 01:01:12,702 --> 01:01:14,004 PIPELINES ARE APPLICABLE 1695 01:01:14,004 --> 01:01:16,273 TO PREDICTION OF SNP EFFECTS PER SE? 1696 01:01:16,273 --> 01:01:16,973 WELL INTEGRATED 1697 01:01:16,973 --> 01:01:18,308 WITH EXISTING BIOINFORMATICS 1698 01:01:18,308 --> 01:01:21,311 PREDICTION METHODS, POLYPHEN, ETC.? 1699 01:01:21,912 --> 01:01:22,178 RIGHT. 1700 01:01:22,178 --> 01:01:22,512 WELL, 1701 01:01:22,512 --> 01:01:26,416 I THINK THE UNDERLYING QUESTION HERE IS 1702 01:01:26,416 --> 01:01:29,419 WHETHER YOU CAN USE ALPHAFOLD TO, 1703 01:01:29,619 --> 01:01:33,056 YOU KNOW, MODEL POINT MUTATIONS. 1704 01:01:33,490 --> 01:01:37,460 AND BY AND LARGE, ALPHAFOLD REALLY 1705 01:01:38,495 --> 01:01:39,963 IGNORES, IF YOU,
1706 01:01:39,963 --> 01:01:41,531 SO IF YOU INTRODUCE, YOU KNOW, 1707 01:01:42,499 --> 01:01:45,502 A FEW MUTATIONS 1708 01:01:45,869 --> 01:01:48,872 IN YOUR BAIT OR YOUR PREY PROTEIN, 1709 01:01:49,139 --> 01:01:50,974 ALPHAFOLD WILL LARGELY IGNORE THAT. 1710 01:01:50,974 --> 01:01:51,942 AND THE REASON FOR THAT IS, 1711 01:01:51,942 --> 01:01:53,009 OF COURSE, THAT 1712 01:01:53,009 --> 01:01:55,946 WHAT IT DOES IS IT CONSIDERS, 1713 01:01:55,946 --> 01:01:56,580 YOU KNOW, 1714 01:01:56,580 --> 01:01:59,583 A WHOLE MULTIPLE SEQUENCE ALIGNMENT. 1715 01:01:59,583 --> 01:02:01,484 SO IT ALWAYS LOOKS FOR, 1716 01:02:01,484 --> 01:02:03,153 YOU KNOW, AS MANY 1717 01:02:03,153 --> 01:02:05,188 ORTHOLOGS AS IT CAN, 1718 01:02:05,188 --> 01:02:06,556 AND THEN IT BUILDS, 1719 01:02:06,556 --> 01:02:08,692 YOU KNOW, A MULTIPLE SEQUENCE ALIGNMENT. 1720 01:02:08,692 --> 01:02:10,393 IT LOOKS FOR COVARIATION. 1721 01:02:10,393 --> 01:02:13,463 AND SO IF YOU TWEAK ONE 1722 01:02:14,230 --> 01:02:16,366 PARTICULAR MEMBER OF THE MSA, 1723 01:02:16,366 --> 01:02:18,335 IT REALLY HAS NO EFFECT. 1724 01:02:18,335 --> 01:02:19,903 BUT OF COURSE 1725 01:02:19,903 --> 01:02:21,838 THERE ARE ALGORITHMS 1726 01:02:21,838 --> 01:02:24,074 THAT HAVE BEEN DEVELOPED 1727 01:02:24,074 --> 01:02:26,810 TO ADDRESS THAT VERY THING. 1728 01:02:26,810 --> 01:02:32,415 AND ALPHAMISSENSE IS PROBABLY, 1729 01:02:32,849 --> 01:02:33,850 YOU KNOW, THE BEST, 1730 01:02:33,850 --> 01:02:37,087 I THINK, STILL, RESOURCE TO ACTUALLY 1731 01:02:37,754 --> 01:02:40,924 GET A SENSE OF WHETHER A MUTATION IS 1732 01:02:40,924 --> 01:02:42,258 DELETERIOUS OR NEUTRAL. 1733 01:02:45,128 --> 01:02:47,664 NOW THE NEXT QUESTION IS FROM ROB SCOTT. 1734 01:02:47,664 --> 01:02:48,832 HELLO. 1735 01:02:48,832 --> 01:02:50,634 THANKS VERY MUCH FOR THE TALK.
1736 01:02:50,634 --> 01:02:51,401 AND, 1737 01:02:51,401 --> 01:02:53,970 HE'S WORKING IN, ALL ATOM 1738 01:02:53,970 --> 01:02:56,373 MOLECULAR DYNAMICS SIMULATION, 1739 01:02:56,373 --> 01:02:58,141 AND HE'S LOOKING FOR APPLICATION 1740 01:02:58,141 --> 01:02:59,476 IN DNA REPAIR. 1741 01:02:59,476 --> 01:03:00,343 AND HE ASK, 1742 01:03:00,343 --> 01:03:01,845 CAN YOU PLEASE TELL 1743 01:03:01,845 --> 01:03:03,780 IS THERE SOME PARTICULAR PROBLEM 1744 01:03:03,780 --> 01:03:05,348 SEE, YOU COULD RECOMMEND THAT 1745 01:03:05,348 --> 01:03:08,351 THAT HE WORK ON. 1746 01:03:10,453 --> 01:03:11,154 AND FOR AN ALL 1747 01:03:11,154 --> 01:03:14,157 ATOM SIMULATION. 1748 01:03:15,258 --> 01:03:16,393 GOSH. 1749 01:03:16,393 --> 01:03:17,127 WELL, 1750 01:03:17,127 --> 01:03:19,062 I MEAN, I THINK, 1751 01:03:19,062 --> 01:03:21,865 I THINK AS SOON AS YOU START TO MODEL, 1752 01:03:21,865 --> 01:03:23,533 YOU KNOW, NOT ONLY PROTEINS, BUT, 1753 01:03:23,533 --> 01:03:27,137 BUT OTHER, YOU KNOW, LIGANDS, 1754 01:03:27,637 --> 01:03:30,140 YOU HAVE TO START TO MOVE INTO AN ALL 1755 01:03:30,140 --> 01:03:33,410 ATOM, SORT OF SETTING. 1756 01:03:33,410 --> 01:03:34,811 AND OF COURSE, 1757 01:03:34,811 --> 01:03:35,578 THAT'S A LOT OF 1758 01:03:35,578 --> 01:03:37,881 WHAT ALPHAFOLD THREE IS DOING. 1759 01:03:37,881 --> 01:03:39,716 BUT I DON'T 1760 01:03:39,716 --> 01:03:41,718 I DON'T THINK I HAVE A REALLY GOOD 1761 01:03:41,718 --> 01:03:44,721 SPECIFIC, RECOMMENDATION FOR YOU. 1762 01:03:45,121 --> 01:03:46,222 I'M HAPPY TO CORRESPOND 1763 01:03:46,222 --> 01:03:47,891 WITH YOU OVER THAT. AND IF YOU WANT. 1764 01:03:49,926 --> 01:03:50,627 THERE'S A 1765 01:03:50,627 --> 01:03:51,761 QUESTION FROM 1766 01:03:51,761 --> 01:03:52,929 SOMEBODY THAT JUST LABELED 1767 01:03:52,929 --> 01:03:53,797 WOODGATES LAB. 1768 01:03:53,797 --> 01:03:54,664 I DON'T KNOW IF IT'S ROGER 1769 01:03:54,664 --> 01:03:56,999 OR SOMEBODY ELSE SAYING NICE. 
1770 01:03:56,999 --> 01:03:58,601 NICE TALK AND DETAIL. 1771 01:03:58,601 --> 01:03:59,803 SMILEY FACE. 1772 01:03:59,803 --> 01:04:00,970 DID YOU CHECK THE KINASE 1773 01:04:00,970 --> 01:04:02,005 ACTIVITY OF STK19 1774 01:04:02,005 --> 01:04:03,406 IN THE COMPLEX, 1775 01:04:03,406 --> 01:04:06,242 OR IS IT ACTING MORE AS AN ADAPTER? 1776 01:04:06,242 --> 01:04:09,612 WELL, SO I CERTAINLY HAVE NOT, 1777 01:04:09,612 --> 01:04:12,115 WE'VE NOT LOOKED AT KINASE ACTIVITY, 1778 01:04:12,115 --> 01:04:13,450 BUT REALLY, I MEAN, 1779 01:04:13,450 --> 01:04:14,117 IT DOESN'T LOOK 1780 01:04:14,117 --> 01:04:16,686 ANYTHING LIKE A KINASE. 1781 01:04:16,686 --> 01:04:18,188 SO IT WOULD BE TRULY SHOCKING 1782 01:04:18,188 --> 01:04:19,456 IF IT HAD KINASE ACTIVITY. 1783 01:04:19,456 --> 01:04:20,924 BUT I THINK JESPER 1784 01:04:20,924 --> 01:04:23,893 ACTUALLY PUBLISHED A PAPER 1785 01:04:24,127 --> 01:04:25,695 WHERE HE ADDRESSED 1786 01:04:25,695 --> 01:04:26,496 NOT ONLY THE 1787 01:04:26,496 --> 01:04:27,797 MISANNOTATION OF STK19, 1788 01:04:27,797 --> 01:04:28,965 BUT ALSO, 1789 01:04:28,965 --> 01:04:30,233 YOU KNOW, 1790 01:04:30,233 --> 01:04:31,968 THE FACT THAT IT'S NOT A KINASE. 1791 01:04:31,968 --> 01:04:33,436 AND HE DID NOT LOOK AT IT 1792 01:04:33,436 --> 01:04:35,839 IN THE CONTEXT OF A COMPLEX, 1793 01:04:35,839 --> 01:04:37,540 BUT CERTAINLY IN ISOLATION, 1794 01:04:37,540 --> 01:04:39,876 AND FOUND NO EVIDENCE OF KINASE ACTIVITY.
1795 01:04:39,876 --> 01:04:41,344 AND IT WOULD BE VERY SHOCKING 1796 01:04:41,344 --> 01:04:44,347 IF IT SUDDENLY BECAME A KINASE 1797 01:04:44,414 --> 01:04:47,150 WHEN IT BINDS TO THE TC-NER COMPLEX. 1798 01:04:49,085 --> 01:04:52,422 AND WE HAVE A MESSAGE FROM 1799 01:04:52,422 --> 01:04:53,690 ANDREW JACKSON, 1800 01:04:53,690 --> 01:04:56,593 AND HE SAYS: HI, JOHANNES. IN SPOC, 1801 01:04:56,593 --> 01:04:57,727 DID YOU LOOK AT 1802 01:04:57,727 --> 01:04:59,162 WHICH PARAMETER GAVE MOST 1803 01:04:59,162 --> 01:05:00,463 PREDICTIVE VALUE? 1804 01:05:00,463 --> 01:05:03,433 IF SO, WHICH WERE MOST INFORMATIVE? 1805 01:05:03,600 --> 01:05:05,735 RIGHT. YEAH, THAT'S A GREAT QUESTION. 1806 01:05:05,735 --> 01:05:08,972 SO I'M TRYING TO REMEMBER. 1807 01:05:08,972 --> 01:05:11,975 FROM THE STRUCTURAL FEATURES, 1808 01:05:12,709 --> 01:05:16,479 IT WAS ACTUALLY, 1809 01:05:16,479 --> 01:05:18,214 I BELIEVE IT WAS A METRIC 1810 01:05:18,214 --> 01:05:20,750 THAT WE CALL AVERAGE MODELS. 1811 01:05:20,750 --> 01:05:23,386 SO ONE OF THE HOMEGROWN METRICS 1812 01:05:23,386 --> 01:05:24,420 THAT WE CAME UP WITH 1813 01:05:24,420 --> 01:05:27,957 WAS SIMPLY TO QUANTIFY THE AGREEMENT 1814 01:05:28,057 --> 01:05:30,760 AMONG THE FIVE INDEPENDENTLY 1815 01:05:30,760 --> 01:05:32,562 TRAINED ALPHAFOLD MODELS. 1816 01:05:32,562 --> 01:05:34,597 SO PROBABLY MANY OF YOU KNOW 1817 01:05:34,597 --> 01:05:37,600 THAT ALPHAFOLD WAS TRAINED ON FIVE 1818 01:05:38,568 --> 01:05:41,571 SLIGHTLY DIFFERENT REGIMES. 1819 01:05:41,571 --> 01:05:43,706 AND SO WHEN YOU ASK IT FOR A PREDICTION, 1820 01:05:43,706 --> 01:05:45,408 WHERE YOU GIVE IT A PROTEIN 1821 01:05:45,408 --> 01:05:46,242 OR A PROTEIN PAIR, 1822 01:05:46,242 --> 01:05:48,511 IT ACTUALLY SPITS OUT FIVE SOLUTIONS.
1823 01:05:48,511 --> 01:05:50,747 AND AND SO WHAT WE FOUND 1824 01:05:50,747 --> 01:05:52,282 WAS THAT THE GREATER 1825 01:05:52,282 --> 01:05:52,949 THE AGREEMENT 1826 01:05:52,949 --> 01:05:53,917 AMONG THOSE, 1827 01:05:53,917 --> 01:05:57,287 FIVE MODELS, THE MORE LIKELY 1828 01:05:57,854 --> 01:05:59,155 SOMETHING IS TO BE TRUE. 1829 01:05:59,155 --> 01:05:59,989 AND WE ACTUALLY HAVE 1830 01:05:59,989 --> 01:06:01,591 THAT AS A FEATURE IN SPOC. 1831 01:06:01,591 --> 01:06:03,059 SO, SO ON THE STRUCTURAL SIDE, 1832 01:06:03,059 --> 01:06:05,895 THAT IS ACTUALLY ONE OF THE BEST FEATURES 1833 01:06:05,895 --> 01:06:08,998 ON THE BIOLOGICAL SIDE, IT TURNS OUT 1834 01:06:08,998 --> 01:06:12,001 ACTUALLY TO BE BIOGRID. 1835 01:06:12,235 --> 01:06:14,804 SO AS YOU KNOW, BIOGRID, 1836 01:06:14,804 --> 01:06:16,406 YOU KNOW, IT CURATES 1837 01:06:16,406 --> 01:06:19,542 NOT ONLY THE LITERATURE BUT ALSO, 1838 01:06:19,809 --> 01:06:22,745 YOU KNOW, A LOT OF PROTEOMIC DATABASES 1839 01:06:22,745 --> 01:06:26,115 AND SO, PERHAPS NOT SURPRISINGLY, BIO 1840 01:06:26,115 --> 01:06:29,686 GRID IS IN FACT, THE MOST, 1841 01:06:30,453 --> 01:06:34,257 IMPORTANT METRIC ON THE BIOLOGICAL SIDE. 1842 01:06:34,958 --> 01:06:37,961 BUT IT IS BY NO MEANS IT IS BY NO MEANS, 1843 01:06:38,394 --> 01:06:41,397 YOU KNOW, SUFFICIENT. 1844 01:06:41,664 --> 01:06:44,601 AND, AND AND WE ACTUALLY SEE 1845 01:06:44,601 --> 01:06:47,704 MANY EXAMPLES OF PROTEIN PAIRS 1846 01:06:47,704 --> 01:06:49,606 LIKE STK19-XPD, 1847 01:06:49,606 --> 01:06:51,474 WHICH ARE NOT REPORTED IN BIOGRID. 1848 01:06:51,474 --> 01:06:54,043 THAT INTERACTION IS COMPLETELY UNKNOWN. 1849 01:06:54,043 --> 01:06:56,713 YET THEY SCORE SUPER HIGHLY. 1850 01:06:56,713 --> 01:06:59,682 SO SO, YOU KNOW, IT'S A COMPLEX 1851 01:06:59,682 --> 01:07:00,917 IT'S A COMPLEX PICTURE. 
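The model-agreement idea described in this answer can be sketched in a few lines of Python. This is only an illustration, not the SPOC implementation: the talk does not say how agreement is computed, so the choice of Jaccard overlap between interface contact sets, the function names `jaccard` and `model_agreement`, and the toy residue pairs are all assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two contact sets; two empty sets count as no agreement."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def model_agreement(contact_sets):
    """Mean pairwise Jaccard overlap across the interface contacts of all models.

    contact_sets: one set per predicted model, each holding hypothetical
    (bait_residue, prey_residue) index pairs that are in contact.
    """
    pairs = list(combinations(contact_sets, 2))
    if not pairs:
        raise ValueError("need at least two models")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy data (invented): five models that all predict the same interface,
# versus five models that each place the partner somewhere different.
consistent = [{(10, 5), (11, 6), (12, 7)} for _ in range(5)]
scattered = [{(10, 5)}, {(40, 90)}, {(3, 77)}, {(61, 2)}, {(25, 33)}]

print(model_agreement(consistent))
print(model_agreement(scattered))
```

In this sketch, a pair whose five models converge on one interface scores near 1 and a pair whose models disagree scores near 0, which mirrors the stated observation that greater agreement among the five models makes a prediction more likely to be true.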
1852 01:07:01,918 --> 01:07:04,420 SO, YOU KNOW, THE CRISPR SCREENS AND, 1853 01:07:04,420 --> 01:07:05,088 YOU KNOW, 1854 01:07:05,088 --> 01:07:06,489 CO-EXPRESSION, 1855 01:07:06,489 --> 01:07:08,291 ALL OF THOSE THINGS CONTRIBUTE. 1856 01:07:08,291 --> 01:07:08,825 AND SO 1857 01:07:08,825 --> 01:07:09,559 IT'S 1858 01:07:09,559 --> 01:07:11,160 REALLY A SUM OF MANY PARTS 1859 01:07:11,160 --> 01:07:12,161 THAT GIVES YOU 1860 01:07:12,161 --> 01:07:13,463 THE ULTIMATE PERFORMANCE 1861 01:07:13,463 --> 01:07:16,032 THAT IS SPOC. 1862 01:07:16,032 --> 01:07:16,599 OKAY. 1863 01:07:16,599 --> 01:07:18,167 ROB SCOTT JUST SAYS THANKS. 1864 01:07:18,167 --> 01:07:18,835 HE WAS THE GUY 1865 01:07:18,835 --> 01:07:19,402 WITH THE QUESTION 1866 01:07:19,402 --> 01:07:20,703 ABOUT WHAT PROBLEMS TO WORK ON. 1867 01:07:20,703 --> 01:07:22,005 MAYBE THIS IS SOMETHING YOU COULD FOLLOW 1868 01:07:22,005 --> 01:07:25,508 UP OFFLINE. BUT KEN WANTS TO 1869 01:07:25,508 --> 01:07:27,310 ASK A QUESTION. KEN? 1870 01:07:27,310 --> 01:07:28,978 WHAT A WONDERFUL PRESENTATION. 1871 01:07:28,978 --> 01:07:30,546 I REALLY ENJOYED THAT. 1872 01:07:30,546 --> 01:07:33,549 IF STK19 IS INVOLVED IN 1873 01:07:33,716 --> 01:07:35,418 DNA REPAIR AND TRANSCRIPTION- 1874 01:07:35,418 --> 01:07:36,519 COUPLED REPAIR, 1875 01:07:36,519 --> 01:07:38,388 AND WE KNOW THAT THE 1876 01:07:38,388 --> 01:07:41,491 MUTATIONS IN THAT CAUSE HUMAN DISEASES, 1877 01:07:42,258 --> 01:07:45,428 AND THE TRANSCRIPTION-COUPLED REPAIR 1878 01:07:46,362 --> 01:07:47,897 ABNORMALITIES SEEM TO INVOLVE 1879 01:07:47,897 --> 01:07:49,766 MORE NEUROLOGIC DEGENERATION 1880 01:07:49,766 --> 01:07:51,401 RATHER THAN CANCER,
1881 01:07:51,401 --> 01:07:53,136 HAVE YOU BEEN ABLE 1882 01:07:53,136 --> 01:07:55,471 TO LOOK AT ANY OF THESE NEW LARGE 1883 01:07:55,471 --> 01:07:57,140 GENOME DATABASES 1884 01:07:57,140 --> 01:07:58,274 TO SEE IF THERE ARE 1885 01:07:58,274 --> 01:08:00,176 ANY ABNORMALITIES IN STK19 1886 01:08:00,176 --> 01:08:03,379 THAT LEAD TO HUMAN DISEASE OR TO CANCER? 1887 01:08:04,047 --> 01:08:06,182 YEAH. THAT'S A TERRIFIC QUESTION. 1888 01:08:06,182 --> 01:08:08,351 PERSONALLY WE'VE NOT DONE THAT. 1889 01:08:08,351 --> 01:08:10,420 I BET, 1890 01:08:10,420 --> 01:08:11,921 I BET SOME OF THE OTHER GROUPS 1891 01:08:11,921 --> 01:08:15,024 WORKING ON STK19 THAT ARE MORE 1892 01:08:15,024 --> 01:08:17,126 GENETICALLY ORIENTED THAN WE ARE 1893 01:08:17,126 --> 01:08:18,661 HAVE PROBABLY DONE THAT. 1894 01:08:18,661 --> 01:08:19,929 AND I IMAGINE 1895 01:08:19,929 --> 01:08:22,498 THAT THE ABSENCE OF ANY 1896 01:08:22,498 --> 01:08:26,202 STATEMENTS OR REPORTS IN ANY OF THE 1897 01:08:26,602 --> 01:08:30,640 PAPERS 1898 01:08:30,640 --> 01:08:32,742 SUGGESTS THAT, SO FAR AT LEAST, 1899 01:08:32,742 --> 01:08:34,744 THERE ARE NO PATIENT MUTATIONS. 1900 01:08:34,744 --> 01:08:36,245 AND I NOW REALIZE THAT I, 1901 01:08:36,245 --> 01:08:37,914 I THINK I MAY HAVE ACTUALLY 1902 01:08:37,914 --> 01:08:41,217 NEGLECTED TO MENTION THE OTHER GROUPS.
1903 01:08:41,684 --> 01:08:42,618 I THINK I HAD 1904 01:08:42,618 --> 01:08:44,654 THEIR REFERENCES ON MY SLIDE, 1905 01:08:44,654 --> 01:08:46,389 BUT I DO WANT TO POINT OUT THAT, 1906 01:08:46,389 --> 01:08:50,026 I THINK IT'S 1907 01:08:50,326 --> 01:08:53,296 THREE OTHER GROUPS ALSO REPORTED 1908 01:08:53,296 --> 01:08:56,899 THE ROLE OF STK19 IN TC-NER, 1909 01:08:57,133 --> 01:08:58,935 AND A NUMBER OF THOSE 1910 01:08:58,935 --> 01:09:00,870 ALSO PROVIDED EVIDENCE 1911 01:09:00,870 --> 01:09:03,006 FOR THE INTERACTION WITH XPD, 1912 01:09:04,273 --> 01:09:05,441 IN ALL CASES BASED ON 1913 01:09:05,441 --> 01:09:08,411 THE ALPHAFOLD PREDICTION. 1914 01:09:11,414 --> 01:09:12,982 I DON'T SEE ADDITIONAL QUESTIONS. 1915 01:09:12,982 --> 01:09:16,686 SO I THINK WE'LL JUST STOP HERE, 1916 01:09:16,686 --> 01:09:18,755 AND I WOULD LIKE TO TAKE THIS OPPORTUNITY 1917 01:09:18,755 --> 01:09:20,590 TO THANK DOCTOR WALTER 1918 01:09:20,590 --> 01:09:21,858 FOR SUCH A FASCINATING 1919 01:09:21,858 --> 01:09:23,693 AND INSIGHTFUL PRESENTATION 1920 01:09:23,693 --> 01:09:24,894 AND THE GROUNDBREAKING WORK THAT 1921 01:09:24,894 --> 01:09:25,928 CLEARLY ADVANCED OUR 1922 01:09:25,928 --> 01:09:27,697 UNDERSTANDING OF TRANSCRIPTION-COUPLED 1923 01:09:27,697 --> 01:09:29,399 DNA REPAIR. 1924 01:09:29,399 --> 01:09:31,134 I WOULD ALSO LIKE TO THANK EVERYONE 1925 01:09:31,134 --> 01:09:31,634 FOR JOINING 1926 01:09:31,634 --> 01:09:33,936 TODAY'S SESSION. THANK YOU ALL. 1927 01:09:33,936 --> 01:09:35,071 AND WE ARE LOOKING FORWARD 1928 01:09:35,071 --> 01:09:36,873 TO SEEING YOU AT FUTURE EVENTS. 1929 01:09:36,873 --> 01:09:38,441 THANK YOU. HAVE A GREAT DAY. 1930 01:09:38,441 --> 01:09:39,542 THANK YOU, DOCTOR WALTER. 1931 01:09:39,542 --> 01:09:42,545 YEAH, THANK YOU ALL FOR LISTENING. 1932 01:09:42,912 --> 01:09:43,646 YEAH. BYE BYE.