>> OKAY. WHAT A NICE AUDIENCE. I WOULD LIKE TO WELCOME YOU TO TODAY'S NIH DIRECTOR'S LECTURE SERIES. OUR SPEAKER IS DR. TERRY PRZYTYCKA. TERRY HAS A VERY INTERESTING BACKGROUND. SHE WAS A MATHEMATICIAN AS AN UNDERGRAD AT WARSAW UNIVERSITY, THEN GOT A Ph.D. IN COMPUTER SCIENCE IN CANADA, AT THE UNIVERSITY OF BRITISH COLUMBIA. SUBSEQUENTLY SHE WAS ON THE FACULTY AS AN ASSISTANT PROFESSOR AT THE UNIVERSITY OF CALIFORNIA, RIVERSIDE, THEN MOVED TO THE UNIVERSITY OF SOUTHERN DENMARK. THEN SHE CAME BACK TO THE UNITED STATES, WHERE SHE WAS IN THE BIOPHYSICS DEPARTMENT AT HOPKINS BEFORE WE WERE VERY FORTUNATE TO RECRUIT HER TO COME TO THE NIH, TO THE NLM NCBI COMPUTATIONAL BIOLOGY BRANCH, WHERE SHE WAS RECENTLY TENURED AS SENIOR INVESTIGATOR. SHE USES MATHEMATICAL MODELS TO UNDERSTAND BIOLOGICAL PROCESSES. SHE THINKS FAIRLY BROADLY ABOUT THESE THINGS, INCLUDING THE USE OF GRAPH THEORY AND ALGORITHMIC KINDS OF APPROACHES, BUT NOT BY ANY MEANS LIMITED TO THOSE. JUDGING BY THE AUDIENCE, THERE ARE LOTS OF PEOPLE AT NIH INTERESTED IN WHAT SHE DOES. THE TALK TODAY IS CALLED TOWARDS BRIDGING THE GENOTYPE PHENOTYPE GAP: SYSTEMS LEVEL DISSECTION OF THE IMPACT OF VARIATION IN DNA SEQUENCE AND STRUCTURE ON GENE EXPRESSION. TERRY. >> THANK YOU FOR THE INTRODUCTION. IT'S A PLEASURE TO BE SPEAKING HERE TODAY TO SHARE WITH YOU THE WORK THAT WE DO IN MY GROUP. I WILL MENTION MY GROUP IS PURELY COMPUTATIONAL, SO WE APPLY VARIOUS COMPUTATIONAL TECHNIQUES TO SOLVE QUESTIONS THAT ARISE IN MOLECULAR BIOLOGY. SO TODAY I HAVE A PROBLEM THAT GENERALLY REVOLVES AROUND THE QUESTION OF UNDERSTANDING THE RELATIONSHIP BETWEEN GENOTYPE AND PHENOTYPE. SO (INDISCERNIBLE) GENOTYPE PHENOTYPE TYPICALLY STARTS WITH ASSOCIATION STUDIES, WHERE WE TRY TO IDENTIFY A GENETIC LOCATION IN THE GENOME SUCH THAT VARIABILITY IN THIS PARTICULAR LOCATION CORRELATES WITH THE PHENOTYPIC VARIABILITY. IF WE CAN IDENTIFY A PLACE LIKE THAT, IT SUGGESTS THAT IN THIS GENERAL REGION THERE MIGHT BE A GENE THAT DIRECTLY OR INDIRECTLY AFFECTS THE PHENOTYPE. 
SO THIS IS JUST A FIRST STEP. IN ORDER TO UNDERSTAND THE RELATIONSHIP BETTER WE NEED TO GO DEEPER: NOT ONLY TO IDENTIFY THE ASSOCIATION BUT TO BE ABLE TO PINPOINT SPECIFIC GENES OR REGULATORY ELEMENTS OR SPECIFIC MUTATIONS THAT ARE RESPONSIBLE FOR THE VARIATION. WE NEED TO UNDERSTAND THE MECHANISM OF ACTION, WHAT EXACTLY IT DOES TO THE MOLECULE IT'S EMBEDDED IN. WE TRY ALSO TO UNDERSTAND HOW THE EFFECT OF THIS MUTATION, OF THE PERTURBATION, AFFECTS OTHER MOLECULES AND OTHER PATHWAYS. LAST BUT NOT LEAST, WE HAVE TO KEEP IN MIND THAT MOST OF THE PHENOTYPES OF INTEREST, MOST OF THE DISEASES, ARE COMPLEX PHENOTYPES, AND WE SHOULDN'T EXPECT THERE TO BE A ONE TO ONE RELATIONSHIP, JUST ONE GENE, ONE PHENOTYPE. INSTEAD, COMPLEX DISEASES AND COMPLEX PHENOTYPES ARE CHARACTERIZED BY THE FACT THAT THEY'RE AFFECTED BY A NUMBER OF GENETIC LOCATIONS WHICH JOINTLY AFFECT THE PHENOTYPE, AND THOSE EFFECTS INTERACT WITH EACH OTHER IN EPISTATIC WAYS, WHICH MAKES IT DIFFICULT TO UNCOVER THOSE RELATIONSHIPS. IN THE LAST FEW YEARS MY GROUP HAS BEEN WORKING ON ALL THOSE PROBLEMS, AND ONE THING THAT'S IMPORTANT TO KEEP IN MIND IS THAT TO ASK SUCH DIVERSE QUESTIONS WE NEED TO LOOK AT THE MOLECULAR SYSTEM AT VARIOUS LEVELS OF ABSTRACTION AND VARIOUS LEVELS OF DETAIL. SO TODAY I ORGANIZED MY PRESENTATION AS LOOKING AT THE QUESTION OF CLOSING THE GENOTYPE PHENOTYPE GAP AT DIFFERENT LEVELS. AT THE STRUCTURE AND SEQUENCE LEVEL, WE TRY TO UNDERSTAND THE MOST DIRECT CHANGES, AND I'LL SAY A LITTLE BIT ABOUT OUR WORK THAT RELATES TO CONFORMATIONAL CHANGES IN DNA STRUCTURE AND RNA STRUCTURE. 
THEN WE WILL NEED TO UNDERSTAND HOW THE PERTURBATIONS PROPAGATE THROUGH THE SYSTEM. THERE WE TAKE A NETWORK LEVEL APPROACH, AND HERE I'M GOING TO DISCUSS OUR (INDISCERNIBLE) TO STUDY PROPAGATION OF GENETIC VARIATION THROUGH THE SYSTEM IN THE CONTEXT OF CANCER: HOW VARIATION DYSREGULATES SPECIFIC MOLECULAR PATHWAYS, AND HOW WE CAN USE COMPUTATIONAL APPROACHES TO SUGGEST OR PREDICT THE GENES AND MUTATIONS THAT ARE CAUSAL FOR THOSE DYSREGULATIONS. FINALLY I WILL ZOOM OUT FARTHER AND TAKE AN ASSOCIATION TYPE APPROACH, AND SHOW HOW, USING ASSOCIATION, WE CAN UNCOVER EPISTATIC INTERACTIONS, WHICH CAN GIVE US QUITE INTERESTING INSIGHT INTO BIOLOGY. I WILL DISCUSS THIS IN THE CONTEXT OF EPISTATIC INTERACTIONS RELATED TO DRUG RESISTANCE. I WOULD LIKE TO START WITH CONFORMATIONAL CHANGES IN DNA STRUCTURE. THIS IS IN COLLABORATION WITH DAVID LEVIN'S GROUP IN NCI AND (INDISCERNIBLE). I WOULD LIKE TO START BY ACKNOWLEDGING (INDISCERNIBLE) FOR THE COMPUTATIONAL PART OF THIS WORK AND ALSO (INDISCERNIBLE), WHO DID THE EXPERIMENTS. THE CANONICAL STRUCTURE OF DNA IS THE DOUBLE HELIX. HOWEVER, THERE'S AMPLE EVIDENCE IN VITRO THAT DNA MAY ALSO ASSUME ALTERNATIVE STRUCTURES, THE MOST PROMINENT BEING (INDISCERNIBLE) SINGLE STRANDED DNA, BUT THERE ARE ALSO A NUMBER OF OTHER STRUCTURES THAT CAN BE ASSUMED BY DNA. NOW, THE STRENGTH OF THE EVIDENCE IS IN VITRO, BUT ALSO IN VIVO THERE ARE SOME INDICATIONS THAT THOSE CONFORMATIONAL CHANGES MAY HAVE REGULATORY ROLES. THEY MAY IMPACT THE GENES THAT ARE IN THE NEIGHBORHOOD OF THE STRUCTURE, AS PROPOSED BY (INDISCERNIBLE), OR, AS DOCUMENTED BY THE PAPER FROM DR. DAVID'S GROUP, THEY CAN ACTUALLY AFFECT GENES THAT ARE AS FAR AS 1 KB FROM THE PERTURBATION. SO BEYOND THOSE ISOLATED EXAMPLES, IS THIS TYPE OF REGULATION USED MORE BROADLY? IN ORDER TO UNDERSTAND WHETHER IT IS, ONE HAS TO SOMEHOW BE ABLE TO IDENTIFY THOSE STRUCTURES IN VIVO. 
SO IN ORDER TO DO THIS, WE USE AN EXPERIMENTAL TECHNIQUE FROM DAVID LEVIN'S LAB WHICH CAPITALIZES ON THE OBSERVATION THAT ALL OF THOSE STRUCTURES ARE ACCOMPANIED BY SOME UNPAIRED BASES. THE PROCEDURE IS TO UNCOVER THOSE BASES, WHICH WE REFER TO AS SINGLE STRANDED DNA. THE IMPORTANT PART FOR US IS TO UNDERSTAND THAT (INAUDIBLE) RECOGNIZING UNPAIRED BASES IN VIVO IS THE FIRST STEP. THE LAST STEP IS TO SEQUENCE FRAGMENTS ADJACENT TO THE UNPAIRED BASES AND MAP THEM TO THE REFERENCE GENOME. NOW, THIS TELLS US WHERE THE UNPAIRED BASES WERE, BUT IT DOESN'T TELL US WHETHER THOSE HAVE ANYTHING TO DO WITH THE ALTERNATIVE STRUCTURES. HERE IS WHERE THE COMPUTATIONAL COMPONENT COMES IN. THOSE STRUCTURES CANNOT FORM JUST ANYWHERE IN DNA; THERE ARE SPECIFIC SEQUENCE MOTIFS OR CERTAIN PROPERTIES OF DNA THAT ALLOW FORMATION OF THE STRUCTURES. THOSE SEQUENCE DEPENDENT PROPERTIES CAN BE IDENTIFIED COMPUTATIONALLY. SO WITH THIS WE CAN PREDICT COMPUTATIONALLY WHERE A DNA SEQUENCE MAY FORM A STRUCTURE AND COMPARE WITH THE EXPERIMENTAL DATA. LOOKING AT THE OVERLAP BETWEEN THE EXPERIMENTAL DATA AND THE PREDICTION, YOU CAN SEE (INDISCERNIBLE) ARE ENRICHED IN THE RESTING STATE. AS EXPECTED, (INDISCERNIBLE) ACTIVITY LEAVES SOME UNPAIRED BASES; LESS OBVIOUSLY BUT IMPORTANTLY, (INDISCERNIBLE), SO THAT GIVES US SOME CONFIDENCE THAT THE UNPAIRED BASES ARE RELATED TO THESE ALTERNATIVE STRUCTURES. HOWEVER, WE WANTED TO HAVE A LITTLE BIT STRONGER EVIDENCE. IT'S TRUE THAT ALL OF THOSE STRUCTURES ARE ACCOMPANIED BY UNPAIRED BASES, BUT THE CONTEXT IN WHICH THE UNPAIRED BASES OCCUR IS DIFFERENT FOR EACH STRUCTURE. HERE WE HAVE TWO SMALL OPENINGS SEPARATED BY THE RIGHT HANDED HELIX, AND HERE WE HAVE MORE COMPLICATED STRUCTURES. SINCE THOSE OPENINGS ARE IN DIFFERENT CONTEXTS, WE SHOULD SEE DIFFERENT SEQUENCE PATTERNS WHEN WE MAP THE ADJACENT FRAGMENTS. HERE IS THE THEORETICAL CONSIDERATION FOR THIS. REMEMBER WE SEQUENCE ADJACENT FRAGMENTS, SO WE SHOULD SEE THEM ON BOTH ENDS: WE SHOULD SEE A PEAK OF FIVE PRIME TAGS FOLLOWED BY A PEAK OF THREE PRIME TAGS. 
THEN THERE SHOULD BE A DEPLETION OF TAGS BECAUSE WE DON'T HAVE FRAGMENTS FROM THIS REGION, AND (INDISCERNIBLE) PATTERN, SO THIS IS THE PICTURE WE SHOULD SEE, AND THIS IS WHAT WE SEE IN PRACTICE. HOW ABOUT OTHER STRUCTURES? THE B DNA, THIS IS RIGHT -- (INDISCERNIBLE) WE HAVE UNPAIRED BASES IN THE TRANSITION, AT THE B TO Z JUNCTION. AGAIN, I'M NOT GOING TO REPEAT THE EXERCISE WE DID, BUT YOU CAN AGAIN PREDICT WHAT THE EXPECTED PATTERN IS, COMPARE IT WITH THE EXPERIMENTAL DATA, AND SEE THAT IT REALLY AGREES. SO WHAT I SHOULD MENTION: WHEN WE LOOK AT A PARTICULAR SEQUENCE IN DNA WHICH HAS A PREDICTED ALTERNATIVE STRUCTURE, WE DO NOT SEE TOO MUCH -- THIS IS CONSISTENT WITH THE UNDERSTANDING THAT THOSE STRUCTURES, IF THEY EXIST, ARE REALLY -- TO CLARIFY, THIS IS NOT AROUND ONE SINGLE PREDICTED SITE; INSTEAD WE POOL THE SITES TOGETHER, ALIGN THEM IN THE MIDDLE, AND THIS IS A PROFILE OVER ALL THE PREDICTED SITES. OKAY. SO THIS SUGGESTS THAT THERE IS ENRICHMENT OF SINGLE STRANDED DNA IN PREDICTED NON-B DNA STRUCTURES, AND THE AGREEMENT OF THE DISTRIBUTION WITH THE THEORETICAL PREDICTION STRONGLY SUGGESTS ABUNDANT OCCURRENCE OF NON-B DNA STRUCTURES IN VIVO. SO THIS IS A WORK IN PROGRESS. WE HAVE LOTS OF OTHER QUESTIONS TO ANSWER: WHAT EXACTLY IS THEIR ROLE, HOW DO THEY INTERACT WITH OTHER ELEMENTS, HOW DOES THIS DEPEND ON CELL TYPE AND CELL STATE. THOSE ARE ALL QUESTIONS THAT WE NEED TO ADDRESS. NOW, LET ME SWITCH GEARS AND MOVE FROM CONFORMATIONAL CHANGES IN DNA STRUCTURE TO CONFORMATIONAL CHANGES IN RNA STRUCTURE. (INDISCERNIBLE) WHO HAS BEEN LEADING THIS PART OF OUR WORK. SO IN THIS PARTICULAR CASE WE WERE VERY INTERESTED IN UNDERSTANDING HOW A SINGLE NUCLEOTIDE POLYMORPHISM MIGHT AFFECT MESSENGER RNA STRUCTURE. WHY IS THIS INTERESTING? IT'S QUITE WELL DOCUMENTED THAT MESSENGER RNA STRUCTURE IN THE FIVE PRIME UTR MIGHT HAVE A REGULATORY ROLE. 
IT MAY AFFECT THE DEGRADATION RATE, IT MAY AFFECT TRANSLATION INITIATION, BUT STRUCTURAL CHANGES MAY ALSO BE IMPORTANT, AS WE PROPOSE, BECAUSE THEY MAY HAVE AN IMPACT ON SPLICING AND CAN CHANGE THE KINETICS OF TRANSLATION, WHICH IN THEORY COULD LEAD TO CHANGES IN THE KINETICS OF FOLDING OF THE PROTEIN CHAIN AND PERHAPS POTENTIALLY IN PROTEIN FOLDING. SO WE WANTED TO DEVELOP A (INAUDIBLE) THAT ALLOWS US TO ASSESS STRUCTURAL DIFFERENCES THAT ARE CAUSED BY A SINGLE NUCLEOTIDE POLYMORPHISM. PERHAPS YOU'RE AWARE THAT THERE ARE WELL ESTABLISHED COMPUTATIONAL PROGRAMS THAT PREDICT A MINIMUM FREE ENERGY STRUCTURE OF MESSENGER RNA. SO WHY DON'T WE TAKE THOSE PROGRAMS, PREDICT THE MINIMUM FREE ENERGY STRUCTURE FOR THE NATIVE SEQUENCE AND FOR THE MUTANT, AND SEE WHAT THE DIFFERENCES ARE? WE DECIDED THIS IS NOT A CORRECT WAY TO APPROACH THIS PROBLEM, AND IT IS NOT CORRECT BECAUSE THOSE THINGS ARE SO SIMILAR TO EACH OTHER. SO I WILL TRY TO CONVINCE YOU THAT FROM A PRACTICAL PERSPECTIVE YOU HAVE TO TAKE THE VIEW OF AN ENSEMBLE OF STRUCTURES. YOU LOOK AT THE MESSENGER RNA SEQUENCE, AND THIS SEQUENCE MAY ACTUALLY ASSUME A NUMBER OF STRUCTURES, AND HERE IS THIS SPACE OF STRUCTURES VISUALIZED BY (INDISCERNIBLE). THE PROBABILITY OF ASSUMING A STRUCTURE IS DIRECTLY DEFINED BY THE FREE ENERGY OF THE STRUCTURE, AND THE MINIMUM FREE ENERGY STRUCTURE WILL BE ASSUMED WITH THE HIGHEST PROBABILITY. BUT OTHER STRUCTURES THAT ARE CLOSE TO THE MINIMUM MIGHT BE ASSUMED TOO. COMPUTATIONAL EXPERIMENTS SHOW THAT WHEN YOU LOOK AT THE MINIMUM STRUCTURE AND THE NEXT BEST STRUCTURE, THE ENERGY DIFFERENCE IS REALLY VERY SMALL, SUGGESTING THAT NOT ONLY THE MINIMUM STRUCTURE IS LIKELY BUT OTHER STRUCTURES SHOULD ALSO BE CONSIDERED. SO NOW WE HAVE TO COMPARE NOT JUST MINIMAL STRUCTURES BUT ENSEMBLES OF STRUCTURES. THAT'S NEW, SO WE NEED TO COME UP WITH A WAY OF MEASURING THE DISTANCE BETWEEN TWO ENSEMBLES OF STRUCTURES AND ALSO COME UP WITH A COMPUTATIONAL METHOD TO COMPUTE THE DISTANCE. 
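[ILLUSTRATIVE SKETCH, NOT FROM THE TALK: THE STATEMENT THAT THE PROBABILITY OF A STRUCTURE IS DEFINED BY ITS FREE ENERGY CAN BE MADE CONCRETE WITH A BOLTZMANN DISTRIBUTION. THE THREE ENERGIES BELOW ARE MADE-UP VALUES, AND THE FUNCTION NAME `boltzmann_probabilities` AND THE CONSTANT `RT` ARE ASSUMPTIONS INTRODUCED ONLY FOR THIS EXAMPLE. IT SHOWS WHY A NEAR-OPTIMAL STRUCTURE CANNOT BE IGNORED WHEN ITS ENERGY IS CLOSE TO THE MINIMUM.]

```python
import math

# Gas constant times temperature at about 37 C, in kcal/mol (assumed units).
RT = 0.616

def boltzmann_probabilities(energies, rt=RT):
    """Map structure free energies (kcal/mol) to ensemble probabilities."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)       # partition function over the listed structures
    return [w / z for w in weights]

# Hypothetical energies: the minimum free energy structure and two
# suboptimal structures that are only slightly less stable.
energies = [-12.0, -11.7, -11.5]
probs = boltzmann_probabilities(energies)

for e, p in zip(energies, probs):
    print(f"energy {e:6.1f} kcal/mol -> probability {p:.3f}")
```

[WITH THESE MADE-UP ENERGIES THE MINIMUM STRUCTURE IS THE MOST LIKELY ONE BUT HOLDS LESS THAN HALF OF THE TOTAL PROBABILITY, WHICH IS THE MOTIVATION FOR COMPARING WHOLE ENSEMBLES.]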
NOW IT IS NOT HARD TO COME UP WITH A MEASURE. I WON'T GO INTO THE -- WE CAN BORROW FROM INFORMATION THEORY, WHERE (INDISCERNIBLE) IS USED TO MEASURE THE DIFFERENCE BETWEEN DISTRIBUTIONS. WE CAN LOOK TO BIOPHYSICS AND ZOOM IN ON THE STRUCTURE AND LOOK AT EVERY BASE PAIR AND HOW THE PROBABILITY OF FORMING IT IS DIFFERENT IN ONE ENSEMBLE VERSUS THE OTHER ENSEMBLE. SO UNFORTUNATELY IT'S EASIER TO COME UP WITH THOSE MATHEMATICAL EQUATIONS THAN TO COMPUTE THEM EFFICIENTLY. SO THE IMPORTANT CONTRIBUTION HERE THAT (INAUDIBLE) IS TO COME UP WITH WHAT IS KNOWN IN COMPUTER SCIENCE AS A DYNAMIC PROGRAMMING APPROACH. WHENEVER YOU CAN DESCRIBE A PROBLEM USING A DYNAMIC PROGRAMMING APPROACH, THAT USUALLY IMPLIES AN EFFICIENT ALGORITHM TO ANSWER THE QUESTION. SO NOW WE HAVE DYNAMIC PROGRAMMING ALGORITHMS TO MEASURE THOSE DIFFERENCES AND, IMPORTANTLY, THOSE ARE NOT REDUNDANT MEASURES. THEY CAPTURE DIFFERENT ASPECTS OF THESE DIFFERENCES. SO WE WANTED TO LOOK AT THE DISEASE ASSOCIATED SNPS AND SEE WHETHER FOR DISEASE ASSOCIATED SNPS WE SEE SIGNIFICANT CHANGES IN MESSENGER RNA STRUCTURE. WE UTILIZED THE DATABASE OF SNPS IN THE FIVE PRIME UTR REGION TO SEE HOW MANY ARE ACCOMPANIED BY SIGNIFICANT CHANGES IN MESSENGER RNA STRUCTURE IN THAT REGION. THERE ARE ACTUALLY A LOT OF THEM; HERE I SHOW THE BIGGEST OFFENDERS SORTED BY RELATIVE ENTROPY. THIS IS A LIST OF DISEASES AND MUTATIONS. THE REASON I WANTED TO POINT OUT THE (INDISCERNIBLE) SYNDROME IS BECAUSE IT IS ALSO ASSOCIATED WITH OCCURRENCES OF A NON-B DNA STRUCTURE, SPECIFICALLY OF THE (INDISCERNIBLE) FORM. THE SECOND THING WE WOULD LIKE TO SEE IS WHETHER CHANGES IN MESSENGER RNA STRUCTURE CAN EXPLAIN DISEASE ASSOCIATED SNPS. HERE WE'RE IN COLLABORATION WITH TWO GROUPS LOOKING AT GENES EXPERIMENTALLY CONFIRMED TO HAVE SILENT MUTATIONS THAT AFFECT GENE FUNCTION, SO MUTATIONS IN THE REGION THAT CAUSE A PROBLEM BUT DON'T CHANGE THE AMINO ACID SEQUENCE. SO THE PROBLEM IS NOT IN THE AMINO ACID SEQUENCE. 
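[ILLUSTRATIVE SKETCH, NOT FROM THE TALK: THE TWO KINDS OF MEASURE DESCRIBED ABOVE, RELATIVE ENTROPY BETWEEN ENSEMBLES AND PER-BASE-PAIR PROBABILITY DIFFERENCES, CAN BE SHOWN ON A TOY ENSEMBLE THAT IS SMALL ENOUGH TO ENUMERATE. THE STRUCTURES, THE ENERGIES, AND ALL FUNCTION NAMES ARE HYPOTHETICAL. THE GROUP'S ACTUAL METHOD USES DYNAMIC PROGRAMMING PRECISELY BECAUSE REAL ENSEMBLES ARE FAR TOO LARGE TO LIST LIKE THIS.]

```python
import math

def boltzmann(energies, rt=0.616):
    """Ensemble probabilities from free energies (kcal/mol, assumed units)."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) over the same list of structures."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def base_pair_probs(structures, probs):
    """Probability of each base pair = total probability of structures containing it."""
    out = {}
    for pairs, p in zip(structures, probs):
        for bp in pairs:
            out[bp] = out.get(bp, 0.0) + p
    return out

# Toy structure space shared by both sequences: each structure is a set of
# (i, j) base pairs. Energies for wild type and for the SNP are made up.
structures = [frozenset({(1, 10), (2, 9)}), frozenset({(1, 10)}), frozenset()]
p_wt = boltzmann([-3.0, -1.5, 0.0])
p_snp = boltzmann([-1.0, -1.5, 0.0])

print("relative entropy:", round(relative_entropy(p_wt, p_snp), 3))
bp_wt = base_pair_probs(structures, p_wt)
bp_snp = base_pair_probs(structures, p_snp)
for bp in sorted(bp_wt):
    print(bp, "wild type", round(bp_wt[bp], 3), "SNP", round(bp_snp[bp], 3))
```

[THE TWO OUTPUTS ANSWER DIFFERENT QUESTIONS: THE RELATIVE ENTROPY IS ONE NUMBER FOR THE WHOLE ENSEMBLE, WHILE THE BASE PAIR PROBABILITIES LOCALIZE WHERE THE CHANGE HAPPENS.]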
THE PROBLEM IS IN SOMETHING THAT HAPPENS BEFORE. SO WE AND OUR COLLABORATORS WOULD LIKE TO KNOW WHETHER THE MESSENGER RNA STRUCTURE CHANGES AROUND THOSE SNPS, AND THE SHORT ANSWER IS YES, THERE ARE SIGNIFICANT CHANGES IN MESSENGER RNA STRUCTURE AT LEAST AROUND SOME OF THOSE SNPS. IN SUMMARY FOR THIS PART, WE HAVE EFFICIENT COMPUTATIONAL METHODS THAT ALLOW US TO MEASURE THE IMPACT OF A SNP ON MESSENGER RNA STRUCTURE, AND THAT HELPS US TO BRIDGE IN AN IMPORTANT WAY THIS GAP BETWEEN GENOTYPE AND PHENOTYPE. IF WE HAVE A DISEASE CAUSING SNP, NOW WE CAN ASK THE NEXT QUESTION, WHETHER OR NOT IT'S POSSIBLE THAT CHANGES IN MESSENGER RNA STRUCTURE ARE CAUSATIVE FOR THE UNDERLYING PERTURBATION. NOW I WOULD LIKE TO ZOOM OUT AND START TO TAKE THE NETWORK LEVEL APPROACH AND ASK THE QUESTION: HOW DO GENETIC PERTURBATIONS LIKE THE ONES IN THE PREVIOUS PART PROPAGATE THROUGH THE SYSTEM? ESPECIALLY IN THE CONTEXT OF DISEASES, HOW DO THEY DYSREGULATE PATHWAYS? FOR THIS QUESTION WE LOOK AT A DIFFERENT TYPE OF GENETIC VARIATION, NAMELY COPY NUMBER VARIATION. THIS IS NOT BECAUSE THE METHOD CANNOT BE APPLIED TO OTHER GENETIC VARIATION, BUT BECAUSE FOR THIS PARTICULAR VARIATION WE HAVE DATA THAT IS BEST SUITED, AND WE HAVE A WONDERFUL COLLABORATION TO LOOK AT COPY NUMBER VARIATION TOGETHER. COPY NUMBER VARIATION IS ACTUALLY VERY IMPORTANT FOR UNDERSTANDING DISEASES, INDEPENDENT OF MY PRACTICAL POINT THAT THIS IS WHERE WE HAVE DATA. NAMELY, IT UNDERLIES A LOT OF DISEASES SUCH AS CROHN'S DISEASE AND (INAUDIBLE). IT REVEALS MUCH MORE STRUCTURAL VARIATION (INDISCERNIBLE) IN GENETICS. SO THERE ARE QUITE A LOT OF THOSE VARIATIONS IN THE POPULATION, PERHAPS SURPRISINGLY INCLUDING COPY NUMBER VARIATIONS THAT AFFECT WHOLE GENES OR WHOLE EXONS. SO THEY'RE QUITE FREQUENT, AND THEY ARE FREQUENT SOMATIC MUTATIONS IN CANCER. 
SO WE'LL START AGAIN WITH OUR COLLABORATIVE WORK WITH BRAD LEVER'S GROUP, WHERE WE ARE TRYING TO DISSECT THE IMPACT OF GENE COPY NUMBER VARIATION ON EXPRESSION OF GENES IN (INDISCERNIBLE). (INDISCERNIBLE) FROM (INDISCERNIBLE) GROUP DID ALL THE EXPERIMENTS AND A GREAT DEAL OF ANALYSIS; THEY ARE DOING ANALYSIS, ESPECIALLY THE PART THAT RELATES TO THE NETWORK. WHAT WE ARE DOING HERE: WE USE THE POWER OF FLY GENETICS. THERE ARE LINES OF FLIES THAT ARE DEFICIENT, WITH CERTAIN DNA REGIONS REMOVED. AND WHENEVER YOU HAVE THOSE REGIONS REMOVED, WHATEVER GENES ARE SUPPOSED TO BE IN THE REGION ARE NOW IN ONE COPY VERSUS IN TWO COPIES. SO WE HAVE THOSE COPY NUMBER VARIATIONS IN FLIES. THOSE FLIES HAVE DELETIONS IN DIFFERENT REGIONS, AND IN OUR STUDY WE USE 21 OF THOSE LINES; THERE ARE MANY MORE IN THE LIBRARY. SO WHAT YOU SEE HERE IS A STRUCTURAL REPRESENTATION OF ALL THE 21 LINES WITH CHROMOSOME DELETIONS, ALL ON CHROMOSOME 2, AND THE BLACK BARS TELL YOU WHERE THE DELETION IS. SO THE FIRST QUESTION WE WANT TO ASK IS: FOR THOSE GENES WITH COPY NUMBER REDUCED FROM TWO TO ONE, WHAT HAPPENS TO EXPRESSION OF THOSE PARTICULAR GENES? WE WOULD EXPECT THAT IF WE REDUCE THE COPY NUMBER BY HALF, EXPRESSION SHOULD DROP BY HALF. THIS IS NOT PRECISELY WHAT'S HAPPENING, THOUGH FOR MOST GENES EXPRESSION DROPS. WHAT YOU SEE HERE IS A GRAPH WHERE WE PLOT THE CHANGE OF EXPRESSION ON A LOG SCALE WITH RESPECT TO THE WILD TYPE. IF EXPRESSION DROPPED BY HALF YOU SHOULD SEE ALL OF THEM AROUND MINUS ONE. WE SEE SOME BUT CERTAINLY NOT ALL. WE SEE AN ACTUAL SPECTRUM OF THEM, INCLUDING A NUMBER OF GENES WHICH DO NOT CHANGE OR ALTER EXPRESSION, AND MORE SURPRISINGLY THERE ARE GENES FOR WHICH, DESPITE THE FACT THAT COPY NUMBER IS REDUCED BY HALF, EXPRESSION IS LARGER, AND QUITE A GOOD BIT FOR WHICH EXPRESSION GOES LOWER THAN WHAT'S JUSTIFIED BY REDUCING THE NUMBER OF COPIES OF THE GENE BY HALF. 
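[ILLUSTRATIVE SKETCH, NOT FROM THE TALK: ON A LOG BASE 2 SCALE, HALVING EXPRESSION GIVES EXACTLY MINUS ONE, WHICH IS WHY THAT VALUE IS THE REFERENCE POINT ON THE PLOT. THE EXPRESSION VALUES, THE GENE NAMES, THE TOLERANCE `tol`, AND THE CATEGORY LABELS BELOW ARE ALL MADE UP FOR ILLUSTRATION AND ARE NOT THE THRESHOLDS USED IN THE STUDY.]

```python
import math

def log2_fold_change(deletion_expr, wild_type_expr):
    """Log2 ratio of expression in a deletion line to expression in wild type."""
    return math.log2(deletion_expr / wild_type_expr)

def describe(lfc, tol=0.25):
    """Label a one-copy gene by its log2 fold change.

    `tol` is an arbitrary illustrative width around the reference values 0 and -1.
    """
    if lfc > tol:
        return "increased despite halved copy number"
    if abs(lfc) <= tol:
        return "unchanged (fully compensated)"
    if abs(lfc + 1.0) <= tol:
        return "roughly halved (dosage-proportional)"
    if lfc < -1.0 - tol:
        return "reduced more than copy number explains"
    return "partially compensated"

# Hypothetical (deletion line, wild type) expression levels for four genes.
examples = {
    "gene_a": (50.0, 100.0),
    "gene_b": (98.0, 100.0),
    "gene_c": (140.0, 100.0),
    "gene_d": (20.0, 100.0),
}
labels = {g: describe(log2_fold_change(d, w)) for g, (d, w) in examples.items()}
for gene, label in labels.items():
    print(gene, "->", label)
```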
NOW, YOU MAY THINK THIS IS A RELATIVELY SMALL NUMBER OF GENES THAT GO ABOVE AND BEYOND OR JUST GO OUT OF CONTROL BOUNDS, BUT THIS IS THE SAME PICTURE WE ALSO SEE IN CANCER. WHEN YOU LOOK AT THE CORRELATION OF COPY NUMBER AND GENE EXPRESSION, THE CORRELATION IS POSITIVE FOR A GREAT DEAL OF GENES, BUT FOR SOME GENES THE RELATION IS SURPRISINGLY -- IF THE COPY NUMBER GOES UP THE EXPRESSION GOES DOWN. SO WE WOULD LIKE TO UNDERSTAND THAT DEEPER, AND TO DO THAT WE PUT IT INTO THE CONTEXT OF A NETWORK. HERE WE HAVE A NETWORK THAT WAS CONSTRUCTED NOT SO LONG AGO, AND WHAT YOU'RE GOING TO SEE IN A MOMENT IS A MOVIE WHICH ITERATES OVER ALL THESE 21 DELETIONS, AND FOR EACH DELETION YOU SEE GENES OUTSIDE THE DELETION WHICH GO UP OR DOWN. I PUT THOSE CIRCLES HERE BECAUSE I WANTED TO MAKE THE POINT THAT THESE CHANGES ARE NON-RANDOM. NAMELY, YOU SEE THAT GENES THAT ARE IN THE SAME NEIGHBORHOOD TEND TO CHANGE IN TANDEM. THEY TEND TO GO UP OR DOWN TOGETHER. SO THAT SUGGESTS STRONGLY THAT THE WAY THOSE COPY NUMBER VARIATIONS AFFECT OTHER GENES IS DEPENDENT ON THE NETWORK THAT WE HAVE IN HAND. SO WE REALLY WOULD LIKE TO UNDERSTAND THIS RELATIONSHIP BETWEEN COPY NUMBER VARIATION AND HOW THOSE COPY NUMBER VARIATIONS PROPAGATE TO AFFECT EXPRESSION OF OTHER GENES IN THE NETWORK, AND WHETHER GENES OUTSIDE AFFECT THOSE GENES WHICH GO UP DESPITE THE FACT THAT WE HALVE THE GENE DOSAGE. AS YOU REMEMBER, WE HAVE ONLY 21 EXPERIMENTS, SO WE CAN'T LOOK AT THE WHOLE NETWORK; THEREFORE WE DECIDED TO LOOK JUST AT THE FIRST NEIGHBORS. SO WE DIVIDE THE GENES INTO THOSE THREE GROUPS AND LOOK AT THE FIRST NEIGHBORS OF ALL OF THOSE GENES IN THE NETWORK AND SEE WHETHER THE FIRST NEIGHBORS OF ANY OF THOSE GROUPS CHANGE MORE THAN THOSE OF THE OTHER GROUPS. WE FOUND, FOR THE EXTREME GROUPS, IN MANY DIFFERENT WAYS, THAT THE NEIGHBORS OF THOSE GENES THAT ARE GOING ABOVE AND BEYOND HAVE MORE CHANGES THAN ALL THE OTHER GENES. 
SUGGESTING THERE IS A PROPAGATION THROUGH THE SYSTEM FROM THOSE GENES TO THE NETWORK AND FEEDBACK FROM THE NETWORK TO THOSE GENES. AND THIS IS BASICALLY WHAT I SAID. COPY NUMBER VARIATION AFFECTS EXPRESSION IN A NETWORK DEPENDENT WAY. THE NETWORK GUIDES THOSE PERTURBATIONS. AND ALSO WE HAVE AN INDICATION THAT THOSE GENES THAT ARE IN THE NETWORK ACT ON THE DELETED GENES TOO. NOW, I THINK THIS IS MOSTLY MOTIVATION FOR ME TO GO TO THE NEXT PART, TO JUSTIFY THE NEED TO DEVELOP AN APPROACH THAT IS GOING TO MODEL HOW A GENETIC PERTURBATION MAY PROPAGATE THROUGH THE SYSTEM TO AFFECT OTHER GENES. THOUGH THE APPROACH IS VERY GENERAL, WE DEVELOPED IT IN THE CONTEXT OF CANCER, WHERE THERE ARE A NUMBER OF COPY NUMBER VARIATIONS AND THE DATA IS SUFFICIENT FOR US TO GO AND TEST THE APPROACH. AND I'LL START BY ACKNOWLEDGING (INAUDIBLE), WHO HAS BEEN HELPING WITH THIS PART OF OUR WORK. SO THE MOTIVATION: OF COURSE WE WOULD LIKE TO ANSWER THE QUESTION I JUST POSED WITH THE PREVIOUS PART, BUT ALSO, IN THE CONTEXT OF DISEASE, WE WOULD LIKE TO IDENTIFY PATHWAYS THAT ARE DYSREGULATED IN DISEASES. IT IS NOW WELL UNDERSTOOD THAT COMPLEX DISEASES SHOULD BE STUDIED IN THE CONTEXT OF PATHWAYS. THERE MAY BE DIFFERENT VARIATIONS WHICH END UP DYSREGULATING THE SAME PATHWAYS, LEADING TO THE FACT THAT THERE MAY BE DIFFERENT CAUSAL MUTATIONS THAT IN THE END GIVE THE SAME OR A RELATED TYPE OF TUMOR. WE WOULD ALSO LIKE TO CAPTURE THE MOST IMPORTANT PLAYERS IN THIS NETWORK, PERHAPS NETWORKS IN THE DISEASES THAT HELP TO SUGGEST DEVELOPMENT. ALSO, WE WOULD LIKE TO BE ABLE TO IDENTIFY, AMONG THE MANY MUTATIONS THAT WE HAVE IN SOME CELLS, WHICH MUTATIONS ARE MOST PROMINENT, MOST IMPORTANT, OR WHICH ARE THE DRIVING MUTATIONS FOR THE EXPRESSION CHANGES THAT WE OBSERVE. SO HERE IS THE DATA. WE HAVE A SET OF (INDISCERNIBLE) DATA FROM (INDISCERNIBLE) WHICH CONTAINS TWO TYPES OF DATA: EXPRESSION DATA, AND THOSE EXPRESSION VALUES ARE COMPARED TO NON-CANCER SO WE KNOW WHICH GENES ARE OVER OR UNDER EXPRESSED, AND AT THE SAME TIME WE HAVE COPY NUMBER VARIATION FROM THE SAME SET OF PATIENTS. 
THE QUESTION IS HOW THE PERTURBATIONS THAT START FROM COPY NUMBER VARIATION PROPAGATE THROUGH THE SYSTEM TO AFFECT THE EXPRESSION OF THE GENES. SO WE WOULD LIKE TO PUT IT IN THE CONTEXT OF THE NETWORK, AND WE INTEGRATE PROTEIN-PROTEIN, PROTEIN-DNA AND PHOSPHORYLATION NETWORKS THAT ARE OBTAINED FROM HIGH THROUGHPUT EXPERIMENTS. BEFORE I DO, HOWEVER, LET ME TAKE THE NETWORK AWAY AND LET ME REMIND YOU OF ONE OF THE SLIDES WHERE I SAID THAT THE FIRST STEP IN UNCOVERING THE RELATIONSHIP BETWEEN GENOTYPE AND PHENOTYPE IS ASSOCIATION STUDIES. GENE EXPRESSION CAN BE CONSIDERED AS A PHENOTYPE. SO THE FIRST STEP WE WOULD LIKE TO MAKE IS, FOR EACH OF THOSE GENES THAT ARE EXPRESSED ABNORMALLY, TO SEE WHETHER EXPRESSION OF THESE GENES IS CORRELATED WITH COPY NUMBER VARIATION. SO WE TAKE GENE EXPRESSION AS A PHENOTYPE, AND THIS IS EXACTLY WHAT IS KNOWN AS EQTL ANALYSIS, AND EQTL ANALYSIS IS KNOWN TO HAVE IMPORTANT PROBLEMS. THE IMPORTANT PROBLEM IS THAT WE HAVE THOUSANDS OF GENES AND THOUSANDS OF VARIATIONS, SO WE WOULD TEST ALL PAIRS, AND WE DON'T HAVE ENOUGH SAMPLES TO REALLY RECOVER FROM THE MASSIVE MULTIPLE TESTING ISSUE WHEN WE DO THAT. WE CANNOT JUST GO AND COMPARE, SO WE HAVE TO BE SMARTER ABOUT THAT. SO WE DECIDED THAT IN THE FIRST STEP WE'RE GOING TO FIND TARGET GENES: GENES DIFFERENTIALLY EXPRESSED RELATIVE TO CONTROL, WHICH ARE MARKERS OF THE DISEASE OR WHICH ARE REPRESENTATIVE OF THE DISEASE. IN ORDER TO IDENTIFY THE SET OF GENES WE FORMALIZED THE QUESTION AS AN OPTIMIZATION QUESTION AND WE REPRESENTED THE RELATIONSHIP USING GRAPH THEORY. THIS IS SOMETHING I WOULD LIKE TO SKIP OVER IN THE INTEREST OF TIME AND JUST SUMMARIZE THAT WE USE OUR TOOLBOX TO COME UP WITH A SET OF GENES THAT REPRESENT THE DISEASE, SO THAT WE ARE CONFIDENT THAT IF WE EXPLAIN THE CHANGES OF THOSE GENES, WE WILL GAIN QUITE A LOT OF UNDERSTANDING OF WHAT'S CHANGING. SO NOW WE HAVE A MUCH SMALLER SET, AND NOW WE CAN AFFORD TO DO THE ASSOCIATION, AND SOME OF THOSE MIGHT TURN OUT TO BE SIGNIFICANT. 
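[ILLUSTRATIVE SKETCH, NOT FROM THE TALK: THE MULTIPLE TESTING BURDEN CAN BE MADE CONCRETE BY COUNTING TESTS. THE GENE AND LOCUS COUNTS BELOW ARE MADE-UP ROUND NUMBERS, AND THE BONFERRONI CORRECTION IS USED ONLY AS A SIMPLE STAND-IN; THE TALK DOES NOT SAY WHICH CORRECTION THE GROUP APPLIED. THE POINT IS THAT SHRINKING THE SET OF TARGET GENES RELAXES THE PER-TEST THRESHOLD BY THE SAME FACTOR.]

```python
def n_pairwise_tests(n_genes, n_loci):
    """Number of gene-by-locus association tests if every pair is tested."""
    return n_genes * n_loci

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

ALPHA = 0.05  # conventional family-wise error rate, assumed for illustration

# Hypothetical sizes: all expressed genes versus a reduced set of target genes.
all_tests = n_pairwise_tests(10_000, 1_000)
few_tests = n_pairwise_tests(100, 1_000)

print("all genes:    ", all_tests, "tests, threshold", bonferroni_threshold(ALPHA, all_tests))
print("target genes: ", few_tests, "tests, threshold", bonferroni_threshold(ALPHA, few_tests))
```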
SO HERE, MARKED BY -- WHENEVER YOU HAVE A SIGNIFICANT ASSOCIATION BETWEEN COPY NUMBER VARIATION AND EXPRESSION VARIATION, THIS SUGGESTS THAT SOMETHING IN THIS REGION, ONE OF THOSE GENES, AND THERE MAY BE HUNDREDS OF THEM IN THIS REGION, AFFECTS DIRECTLY OR INDIRECTLY THIS TARGET GENE HERE. SO NOW IT IS TIME FOR US TO PUT OUR NETWORK BACK. I WOULD NOW LIKE TO EXPLAIN HOW IT'S POSSIBLE THAT A SIGNAL TRAVELS FROM ONE OF THOSE GENES; WE SIMPLIFY THE QUESTION AND ASSUME THAT THIS IS ONE OF THOSE GENES. SO HOW IS IT POSSIBLE THAT IT TRAVELS FROM ONE OF THE GENES THAT ARE IN THIS REGION WITH COPY NUMBER VARIATION TO THE TARGET GENE? NOW, THE NETWORK THAT WE HAVE FROM HIGH THROUGHPUT METHODS WAS OBTAINED OUT OF CONTEXT. IT HAS BEEN OBTAINED IN VITRO, OR IF IN VIVO, NOT IN THE (INDISCERNIBLE) WE STUDIED. SO IT IS SOMETHING, BUT WE CANNOT RELY ON IT EXACTLY; WE NEED TO SOMEHOW PUT IT IN THE CONTEXT OF OUR PARTICULAR DISEASE, IN THE CONTEXT OF CANCER. FORTUNATELY WE HAVE THE EXPRESSION DATA FOR ALL GENES, INCLUDING THE GENES IN THE NETWORK. SO WE CAN USE IT. WE REASON THAT, OF THE MANY WAYS IN WHICH THOSE GENES CAN CONNECT TO THESE GENES, THE PATHWAYS WHOSE EXPRESSION CORRELATES WITH EXPRESSION OF THE TARGET GENE ARE MORE LIKELY TO (INAUDIBLE). SO WE WOULD LIKE TO FIND A PATHWAY THAT CONNECTS THIS REGION WITH THIS REGION, CHOOSING GENES THAT ARE CORRELATED WITH OUR TARGET GENE. TO BIAS TOWARD THAT TYPE OF PATHWAY WE REPRESENT ALL THIS NETWORK AS (INDISCERNIBLE). THIS ALLOWS US TO PUT RESISTANCES ON IT, WITH THE RESISTANCE BEING DEFINED BY HOW WELL THE EXPRESSION OF THE TWO ADJACENT NODES CORRELATES WITH EXPRESSION OF THE TARGET GENE. SO IN THIS WAY WE CAN BIAS THE CURRENT TO FLOW THROUGH THE LOWEST RESISTANCE PATH. IT DOESN'T HAVE TO BE A LINEAR PATH; IT MAY BE A SUBNETWORK REALLY. 
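[ILLUSTRATIVE SKETCH, NOT FROM THE TALK: THE CURRENT FLOW IDEA CAN BE SHOWN ON A FOUR-NODE NETWORK. ONE UNIT OF CURRENT ENTERS AT A CANDIDATE CAUSAL GENE `C` AND LEAVES AT THE TARGET GENE `T`, AND KIRCHHOFF'S LAWS GIVE A LINEAR SYSTEM FOR THE NODE VOLTAGES. THE NODE NAMES AND THE CONDUCTANCE VALUES (CONDUCTANCE BEING ONE OVER RESISTANCE) ARE MADE UP; IN THE TALK THE RESISTANCES COME FROM EXPRESSION CORRELATION WITH THE TARGET GENE, AND HOW EXACTLY THEY ARE COMPUTED IS NOT STATED. THE SMALL DENSE SOLVER IS ONLY FOR THIS TOY CASE.]

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system a x = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def current_flow(nodes, conductance, source, sink):
    """Edge currents when one unit of current enters at `source` and leaves at `sink`."""
    free = [n for n in nodes if n != sink]  # the sink voltage is fixed at 0
    idx = {n: i for i, n in enumerate(free)}
    a = [[0.0] * len(free) for _ in free]
    b = [0.0] * len(free)
    b[idx[source]] = 1.0
    # Build the reduced graph Laplacian: Kirchhoff's current law at each free node.
    for (u, v), g in conductance.items():
        for x, y in ((u, v), (v, u)):
            if x in idx:
                a[idx[x]][idx[x]] += g
                if y in idx:
                    a[idx[x]][idx[y]] -= g
    volts = dict(zip(free, solve(a, b)))
    volts[sink] = 0.0
    # Ohm's law on every edge, signed in the direction u -> v.
    return {(u, v): g * (volts[u] - volts[v]) for (u, v), g in conductance.items()}

# Hypothetical network: two routes from C to T. The route through A is made of
# well-correlated (high conductance) edges, the route through B is not.
nodes = ["C", "A", "B", "T"]
conductance = {("C", "A"): 0.9, ("A", "T"): 0.9, ("C", "B"): 0.1, ("B", "T"): 0.1}
currents = current_flow(nodes, conductance, source="C", sink="T")
for edge, amount in currents.items():
    print(edge, round(amount, 3))
```

[IN THIS TOY CASE MOST OF THE CURRENT GOES THROUGH `A`, BUT THE ROUTE THROUGH `B` STILL CARRIES SOME, WHICH MATCHES THE REMARK THAT THE RESULT CAN BE A SUBNETWORK RATHER THAN A SINGLE LINEAR PATH.]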
BUT NOT ONLY CAN WE FIND THE NODES THAT FACILITATE THE CONNECTIONS BETWEEN THIS AND THIS REGION, BUT ALSO BY THIS APPROACH WE CAN IDENTIFY WHICH ONE OF THE MANY GENES, AND SOMETIMES THERE ARE OVER HUNDREDS OF THEM, THAT MAY BE IN THIS GENOMIC LOCUS IS MORE LIKELY CAUSAL FOR THE VARIATION. SO WITH THIS APPROACH SO FAR, WE CAN PUT HANDS ON THREE INTERESTING SETS OF GENES. THE FIRST SET IS THE SET OF OUR MARKER GENES, THE GENES THAT WE SELECTED AS REPRESENTATIVE GENES FOR THE DISEASE. HERE IS THE LIST OF THESE GENES; IF YOU ARE DOING CANCER RESEARCH YOU WILL PROBABLY FIND YOUR GENES ON THIS LIST. JUST TO MENTION, THE (INDISCERNIBLE) ARE GENES THAT ARE UP REGULATED AND DOWN REGULATED. THE SECOND SET IS THE CAUSAL GENES, THE GENES WITH COPY NUMBER VARIATION THAT ARE MORE LIKELY TO CAUSE THE PERTURBATION OF THOSE GENES, MEANING THERE'S SIGNIFICANT CURRENT FROM THOSE GENES TO THOSE GENES. AGAIN, HERE ARE THE GENES WHICH ARE AMPLIFIED BY COPY NUMBER VARIATION, AND THESE ARE THE GENES THAT ARE DELETED. IN BLUE ARE SOME GENES THAT ARE KNOWN TO BE CRUCIAL FOR GLIOMA. AND LAST, WE CAN CAPTURE THE NODES THAT BELONG TO MANY PATHWAYS IN THE NETWORK, AND AGAIN (INDISCERNIBLE) ARE OVEREXPRESSED IN THE CANCER, IN GREEN ARE UNDEREXPRESSED IN THE CANCER, AND THERE ARE GENES THAT HAPPEN TO BE IN OUR NETWORK BUT IF YOU LOOK AT THE EXPRESSION YOU WOULDN'T SEE A CLEAR SIGNATURE THAT THEY ARE ABNORMAL OTHERWISE. AND AGAIN YOU RECOGNIZE KNOWN CANCER GENES, AND LEADING THE WAY IS (INDISCERNIBLE) WITH (INDISCERNIBLE) 100 PATHWAYS HERE. SO THESE ARE THE GENES IN THOSE PATHWAYS. NOW WE'RE GOING TO IDENTIFY PATHWAYS THAT ARE DYSREGULATED BY THOSE COPY NUMBER VARIATIONS. TO DO SO, REMEMBER WE NOW HAVE A LOT OF PAIRS FOR WHICH WE IDENTIFIED THAT EXPRESSION OF THIS TARGET GENE IS CORRELATED WITH THE COPY NUMBER VARIATION OF THIS REGION. SO WE HAVE LOTS OF THOSE PAIRS. HERE ARE TWO OF THEM. FOR SOME OF THOSE PAIRS WE COULDN'T FIND STATISTICALLY SIGNIFICANT CURRENT GOING FROM HERE TO THERE, AND THEN WE DROP THEM. 
SO FOR MANY OF THESE PAIRS WE COULD FIND STATISTICALLY SIGNIFICANT CURRENT FLOW BETWEEN THIS REGION AND THIS GENE, AND THE DOTS HERE REPRESENT THE GENES THAT ARE SIGNIFICANTLY AFFECTED BY THIS CURRENT. I DROPPED THE EDGES BECAUSE REALLY WE WOULD LIKE TO THINK ABOUT THOSE GENES. WE HAVE TO KEEP IN MIND THAT OUR NETWORKS ARE VERY NOISY, AND WHILE IN GENERAL WE HAVE A GOOD INDICATION THAT THE GENES THAT ARE CLOSE IN THE NETWORK ARE RELATED, I WOULD NOT GO SO FAR AS TO BE SURE THAT THE BIOLOGICAL SIGNAL IS EXACTLY REPRESENTED BY THE CURRENT. SO I'M LOOKING AT THEM FROM THE PERSPECTIVE OF THE GENES, AND I'M ASKING THE QUESTION: IF I LOOK AT THOSE SETS OF GENES, WHAT IS THE INTERSECTION? IF THE GENES IN THE INTERSECTION BELONG TO SOME SPECIFIC BROAD PATHWAY, WE WILL GET A HOLD OF WHICH PATHWAYS ARE REALLY OFTEN DYSREGULATED BY THOSE DIFFERENT MUTATIONS. HERE IS THE LIST OF THOSE PATHWAYS THAT WE DISCOVERED, PATHWAYS THAT ARE IN THE OVERLAP, AND TO DEFINE THE PATHWAY WE USED GENE ONTOLOGY ANNOTATION, USING THE MOST SPECIFIC TERM THAT WE CAN APPLY FOR A GIVEN SET. YOU CAN SEE THE CELL CYCLE IN THE OVERLAP, EPIDERMAL GROWTH FACTOR RECEPTOR, WHICH WAS NINE, AND SO ON; THERE IS A LIST HERE OF ALL THE PATHWAYS THAT APPEAR IN MORE THAN ONE OF THE SETS OF GENES. AGAIN, AS YOU LOOK AT THE PATHWAYS, YOU ARE NOT SURPRISED TO SEE THEM BE INVOLVED IN CANCER; SOME OF THEM ARE PERHAPS MORE SURPRISING AND THOSE PROBABLY ARE MORE INTERESTING. THEY MAY ALSO BE FALSE POSITIVES TOO. SO WE HAVE A NUMBER OF ADVANTAGES TO THIS APPROACH. FIRST OF ALL, IT IS IMPORTANT THAT WE COULD USE EXPRESSION AND INTERACTION DATA TO PUT IT IN THE CONTEXT OF THE DISEASE WITH WHICH WE ARE WORKING. CURRENT FLOW IS -- THIS IS A SET OF LINEAR EQUATIONS THAT WE CAN SOLVE EFFICIENTLY. 
THOSE ARE HUGE SETS OF EQUATIONS; THE NUMBER OF VARIABLES IS EQUAL TO THE NUMBER OF GENES, SO IT'S NOT A (INDISCERNIBLE) PROBLEM, AND WE ALSO NEED TO SAY WHETHER THE CURRENT WAS STATISTICALLY SIGNIFICANT OR NOT, SO WE NEED TO HAVE PERMUTATION TESTS, SO WE SOLVE THIS SET OF (INDISCERNIBLE). SO WE DEVELOPED AN APPROACH WHICH ALLOWS US TO REUSE PARTIAL SOLUTIONS AS WE DO OUR PERMUTATION TEST. THIS IS NOW IN PRINT. MANY OF THE EDGES THAT WE HAVE IN THE NETWORK, THOSE THAT ARE PROTEIN-DNA INTERACTIONS AND PHOSPHORYLATION, SHOULD REALLY BE DIRECTED, AND WE EXTENDED THE FLOW APPROACH TO DIRECTED EDGES. THERE ARE TWO WAYS OF DOING THAT: YOU CAN PUT CONSTRAINTS TO FORCE IT INTO ONE DIRECTION OR HAVE A HEURISTIC APPROACH. SO TO MY KNOWLEDGE THIS IS THE FIRST APPROACH IN THIS PART OF THE LITERATURE THAT IDENTIFIES PATHWAYS THAT ARE DYSREGULATED AND THAT CONNECT GENOTYPE AND PHENOTYPE, AND THIS JUST CAME OUT A COUPLE OF MONTHS AGO IN COMPUTATIONAL BIOLOGY. OKAY. MY LAST SEVEN MINUTES I WILL USE TO GO EVEN HIGHER IN ABSTRACTION AND LOOK AT THE INFORMATION THAT WE CAN GAIN ABOUT THE MOLECULAR SYSTEM FROM THE ASSOCIATIONS INVOLVED. HERE I'M GOING TO LOOK AT EPISTATIC INTERACTIONS. LET'S START BY DEFINING EPISTATIC INTERACTION, BECAUSE THERE ARE VARIOUS WAYS OF DEFINING IT. WHAT I'M GOING TO UNDERSTAND AS EPISTATIC INTERACTION FOR THIS PRESENTATION IS THE SITUATION WHEN YOU HAVE TWO GENES WHICH AFFECT THE SAME PHENOTYPE BUT THAT EFFECT IS NOT INDEPENDENT, MEANING THAT WHATEVER EFFECT THIS GENOTYPE HAS ON THE PHENOTYPE CAN BE MODIFIED BY THIS GENOTYPE. THIS CAN BE EXPLAINED BY SYNTHETIC LETHALITY IN YEAST, AND I SWITCH TO YEAST BECAUSE THE REST OF MY TALK WILL REFER TO THIS ORGANISM. SO THE IDEA IS THAT WE DELETE A GENE AND WE LOOK AT WHETHER THE ORGANISM CAN STILL GROW, AND ASSUME IT CAN DO THAT. WE DELETE THE OTHER GENE AND IT STILL GROWS. WHEN YOU REMOVE BOTH OF THEM, IT DOESN'T. 
SO THAT SUGGESTS THAT THERE IS SOME HIDDEN RELATIONSHIP BETWEEN THOSE TWO GENES, BECAUSE IF THEY WERE INDEPENDENT WE WOULD EXPECT THAT IT SHOULD BE FINE WITHOUT BOTH OF THEM. SO THIS INTERACTION IS WHAT WE CALL EPISTATIC INTERACTION, AND IT IS PARTICULARLY IMPORTANT IN THE STUDY OF DRUG RESISTANT PHENOTYPES. IN THIS STUDY, AND HERE I ACKNOWLEDGE (INDISCERNIBLE) IN MY GROUP, WE USED GENETIC CROSSES TO UNCOVER EPISTATIC INTERACTIONS IN THE CONTEXT OF DRUG RESISTANCE. SO THE IDEA IS WE HAVE TWO STRAINS THAT HAVE DIFFERENT GENETIC BACKGROUNDS. IN OUR APPROACH WE ASSUME THAT THEY HAVE THE SAME PHENOTYPE: THEY ARE BOTH DRUG RESISTANT. WHEN WE CROSS THEM WE OBTAIN A NUMBER OF PROGENY; SOME MIGHT BE DRUG RESISTANT, SOME MIGHT NOT. SO WE HAVE THOSE TWO PARENTS THAT ARE RESISTANT. WE HAVE THE GENOME FOR PARENT 1 AND FOR PARENT 2, AND FOR THE PROGENY WE HAVE A MOSAIC OF THOSE PARENTAL STRAINS. WE ALSO HAVE THE PHENOTYPE OF THE PROGENY, WHICH IS MIXED. HOW MIGHT EPISTATIC INTERACTION EXPLAIN THAT TYPE OF SITUATION? THERE MAY BE MANY EXPLANATIONS; WE CAN ONLY FIND ONES CONSISTENT WITH OUR MODEL. OUR MODEL IS AS FOLLOWS. IT'S POSSIBLE THAT WE HAVE TWO GENOMIC LOCATIONS THAT INTERACT WITH EACH OTHER, AND THOSE INTERACTIONS DEVELOPED INDEPENDENTLY IN EACH OF THE STRAINS. WHENEVER BOTH OF THOSE LOCATIONS ARE INHERITED FROM THE SAME PARENT, THE PROGENY CAN WITHSTAND THE DRUG. WHEN ONE IS INHERITED FROM ONE PARENT AND THE OTHER IS INHERITED FROM THE OTHER PARENT, THERE MAY BE AN INCOMPATIBILITY: THE INTERACTION, OBTAINED INDEPENDENTLY IN EACH STRAIN, IS BROKEN AND WE LOSE THE RESISTANCE TO THE DRUG. SO WE DECIDED TO TRY THIS MODEL AND SEE WHETHER, STARTING WITH THIS ASSUMPTION, WE CAN FIND SOME INTERESTING INTERACTIONS THAT CAN BE JUSTIFIED. SO IN GENERAL WE ARE LOOKING FOR PAIRS OF GENOMIC LOCATIONS SUCH THAT WHENEVER THEY ARE BOTH ZERO OR BOTH ONE WE HAVE RESISTANCE TO THE DRUG, BUT WHEN ONE IS ZERO AND ONE IS ONE WE HAVE A SENSITIVE STRAIN. 
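[ILLUSTRATIVE SKETCH, NOT FROM THE TALK: THE TWO-LOCUS MODEL ABOVE CAN BE WRITTEN AS A SIMPLE SCORE. EACH PROGENY HAS A GENOTYPE OF ZEROS AND ONES (WHICH PARENT EACH LOCUS CAME FROM), AND A PAIR OF LOCI FITS THE MODEL WHEN "SAME PARENT AT BOTH LOCI" COINCIDES WITH "RESISTANT". THE SIX PROGENY, THE THREE LOCI, AND THE FUNCTION NAMES ARE MADE UP. THIS BRUTE-FORCE RANKING OVER ALL PAIRS IS EXACTLY WHAT DOES NOT SCALE STATISTICALLY, SO IT IS A STATEMENT OF THE MODEL, NOT THE GROUP'S FILTERING METHOD.]

```python
from itertools import combinations

def pair_consistency(genotypes, resistant, i, j):
    """Fraction of progeny where (locus i and j from the same parent) == resistant."""
    hits = sum((g[i] == g[j]) == r for g, r in zip(genotypes, resistant))
    return hits / len(genotypes)

def rank_pairs(genotypes, resistant):
    """Score every pair of loci against the model, best first."""
    n_loci = len(genotypes[0])
    scores = {
        (i, j): pair_consistency(genotypes, resistant, i, j)
        for i, j in combinations(range(n_loci), 2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical progeny: 0 = allele from parent 1, 1 = allele from parent 2.
genotypes = [
    (0, 0, 1),
    (1, 1, 0),
    (0, 1, 1),
    (1, 0, 0),
    (0, 0, 0),
    (1, 1, 1),
]
# Resistance here was constructed so that loci 0 and 1 follow the model.
resistant = [True, True, False, False, True, True]

ranking = rank_pairs(genotypes, resistant)
for pair, score in ranking:
    print(pair, round(score, 2))
```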
IN CASES WHEN YOU HAVE -- IT'S EASIER TO STATE THE PROBLEM THAN TO SOLVE IT, BECAUSE IF WE TRY ALL THE POSSIBLE PAIRS IN THE ANALYSIS, WE LOSE STATISTICAL POWER: WE DON'T HAVE ENOUGH DATA TO COMPENSATE FOR THE LOSS OF POWER DUE TO MULTIPLE HYPOTHESIS TESTING. THEREFORE THE FIRST STEP IN THIS APPROACH WAS REALLY TO USE, IN THIS CASE, A GRAPH-THEORETICAL APPROACH TO SOMEHOW ZOOM IN AND FILTER THE MORE PROMISING PAIRS, SO WE WILL ONLY TEST A SMALL NUMBER OF THEM. SO WE APPLIED THIS TO A YEAST DNA REPAIR PHENOTYPE; THIS IS DATA AVAILABLE ONLINE FROM A STUDY DONE IN 2008. WE LOOKED AT THE RESPONSE TO NPO TREATMENT BECAUSE IN THIS PARTICULAR CASE WE HAVE THE SITUATION THAT IS CONSISTENT WITH OUR MODEL, NAMELY BOTH PARENTS WERE RESISTANT. AMONG THE PROGENY WE HAVE 31 SENSITIVE AND 53 RESISTANT. APPLYING OUR METHOD WE IDENTIFIED FIVE INTERACTING PAIRS. AND INTERESTINGLY, IN THE FIVE IT WAS ALWAYS L-1 INTERACTING WITH L-2, L-1 WITH L-3 AND SO ON. SO L-1 IS INTERACTING WITH FOUR, FIVE OTHER LOCI SHOWN IN THIS PICTURE. SO WHEN DOING THIS COMPUTATION WE GOT WORD THAT TREY IDEKER AND HIS GROUP WERE MEASURING INTERACTIONS EXPERIMENTALLY IN THE CONTEXT OF, AGAIN, A DNA DAMAGE PHENOTYPE. THEN YOU USE EXACTLY THE SAME CHEMICALS AS WE DID. BUT WE THOUGHT THAT IF -- I SKIPPED ONE. THANKS. WE LOOKED AT THE LOCUS L-1 MANUALLY TO SEE WHAT'S IN IT. AND LOOKING GENE BY GENE, THERE WAS ONE GENE, RAD5, WHICH IS INVOLVED IN THE DOUBLE-STRAND BREAK REPAIR PROCESS. WE HYPOTHESIZED THAT THIS GENE IS ACTUALLY THE HUB OF THOSE EPISTATIC INTERACTIONS. SO NOW, TO TEST THE HYPOTHESIS, WE ASKED TREY TO LOOK AT HIS DATA AND SEE WHETHER, IN HIS PHENOTYPE, WHEN THEY GO PAIR BY PAIR, RAD5 OCCURS AS A HUB, AND WHETHER HE CAN CONFIRM THAT THERE ARE ALSO SOME OTHER GENES IN THESE LOCI THAT INTERACT WITH RAD5, AND SO ON. HE CONFIRMED THOSE PREDICTIONS. SO THAT SHOWS THAT THIS VERY ABSTRACT WAY OF LOOKING AT THE INTERACTIONS ALLOWS US, IN A SIMPLE APPLICATION, TO ZOOM IN AND IDENTIFY DRUG RESISTANCE GENES.
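The multiple-testing concern that motivates the filtering step can be made concrete with a little arithmetic. The sketch below is illustrative only: the marker count of 3000 and the filtered set of 50 pairs are hypothetical numbers that do not come from the talk, and the Bonferroni correction is used here simply as a familiar yardstick, not as the correction the group applied.

```python
from math import comb

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

n_markers = 3000          # hypothetical number of genotyped markers
n_progeny = 31 + 53       # sensitive + resistant progeny, as quoted in the talk

all_pairs = comb(n_markers, 2)   # every unordered pair of markers
filtered_pairs = 50              # hypothetical size of a pre-filtered candidate set

print("progeny available:", n_progeny)
print("all pairs:", all_pairs,
      "-> per-test threshold", bonferroni_threshold(all_pairs))
print("filtered pairs:", filtered_pairs,
      "-> per-test threshold", bonferroni_threshold(filtered_pairs))
```

With only 84 progeny, a per-test threshold on the order of 1e-8 is very hard to reach, while a threshold near 1e-3 for a short list of candidate pairs is far more attainable. That is the reason for testing only a small, pre-filtered set of pairs.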
THIS IS INTERESTING BECAUSE YOU CAN TAKE ANY PARASITE THAT HAS INDEPENDENTLY ACQUIRED DRUG RESISTANCE, OR YOU CAN EVEN FORCE IT IN THE LAB AND GROW IT UNTIL IT GETS RESISTANT TO THE DRUG, THEN CROSS IT, AND YOU CAN APPLY THIS METHOD TO PINPOINT WHICH GENES ARE IMPORTANT FOR DRUG RESISTANCE. THIS IS REALLY IMPORTANT TO KNOW. WITH THAT, I'M OUT OF TIME, SO IT'S TIME FOR ME TO CONCLUDE. I HOPE I CONVINCED YOU THAT WITH THOSE APPROACHES AT VARIOUS LEVELS, USING COMPUTATIONAL INSIGHT TO SUPPORT THE EXPERIMENTAL DATA, WE CAN START BRIDGING THE GAP FROM GENOTYPE TO PHENOTYPE. I SHOWED YOU OUR WORK ON HOW VARIATION IN DNA AND RNA CONFORMATION MAY AFFECT EXPRESSION OF A GENE, HOW THAT VARIATION IN GENE EXPRESSION WILL PROPAGATE THROUGH THE NETWORK, DYSREGULATING WHOLE PATHWAYS, AND HOW TO USE COMPUTATIONAL METHODS TO PROPOSE PATHWAYS DYSREGULATED IN DISEASES. THOSE HYPOTHESES NEED TO BE TESTED IN THE LAB TO BE CONFIRMED. AND ALSO HOW THOSE COMPUTATIONAL METHODS CAN SUGGEST AN EXPERIMENTAL TECHNIQUE TO IDENTIFY GENES INVOLVED IN DRUG RESISTANCE. OUR COLLABORATORS, AS I WAS -- I COULDN'T DO IT WITHOUT THE WONDERFUL FELLOWS IN MY GROUP AND WONDERFUL COLLABORATORS. ALSO, AS YOU CAN SEE, THEY HAVE DIFFERENT EXPERTISE. SO (INDISCERNIBLE) IN COLLABORATION. YANG HUANG DID THE EPISTASIS AND GRAPH THEORY, YOO-AH KIM DID THE CANCER NETWORKS AND OPTIMIZATION. RAHELEH SALARI AND DAMIAN WOJTOWICZ. AND I'M VERY GRATEFUL TO THEM AND TO YOU FOR HAVING ME. THANK YOU. [APPLAUSE] >> I'M SURE TERRY WOULD BE HAPPY TO ENTERTAIN QUESTIONS. IF YOU HAVE THEM, TRY TO GET TO THE MICROPHONE BECAUSE WE ARE BROADCASTING THIS. CHUCK. >> THANK YOU VERY MUCH. A WONDERFUL TALK. WE HAVE A QUESTION FOR YOU GOING BACK TO YOUR DROSOPHILA DELETION STUDIES, TRYING TO GET A FEEL, WITH A REDUCTIONIST GENETICS HAT ON, FOR HOW PRACTICAL IT IS TO ACTUALLY IDENTIFY THE MASTER OR CAUSAL GENES IN THAT, AS OPPOSED TO BASICALLY PULLING THINGS OUT OF POTENTIAL BACKGROUND NOISE.
AS AN EXAMPLE, ONE COULD ENVISION, INSTEAD OF LOOKING AT WHOLE ADULTS GROUND UP, SAY YOU SEPARATED THE HEAD FROM THE REST, YOU MIGHT SEE DIFFERENT PATTERNS, OR IF YOU STAGE THEM BY HOW OLD THEY ARE, YOU MIGHT SEE DIFFERENT PATTERNS, AND EVEN THOSE ARE CRUDE PATTERNS. HOW MUCH NOISE IS IN THERE? >> THE MASTER REGULATORS, WE LOOKED AT THEM IN THE CANCER DATA, WHICH ARE FROM ONE CANCER TISSUE. YOU'RE COMPLETELY RIGHT. THIS IS WHERE WE'RE HAVING -- WE'RE TRYING TO LOOK AT THE INDIVIDUAL TISSUES AND SEE HOW THE NETWORK, AND EVEN HOW THE NETWORK THAT WOULD BE CONSTRUCTED BASED ON EXPRESSION CORRELATION, DIFFERS BETWEEN THESE TISSUES. THAT IS THE FIRST QUESTION. OF COURSE THE FACT THAT WE HAVE ALL THE TISSUES TOGETHER MUDDIES THE PICTURE A LITTLE BIT. >> I THOUGHT THEY WANTED TO ANSWER IT. >> SO SEVERAL TIMES YOU MENTIONED THE FACT THAT YOU HAVE EITHER A LIMITED AMOUNT OF DATA IN TERMS OF SAMPLES OR TOO MUCH DATA IN TERMS OF INTERACTIONS. I'M WONDERING WHAT YOUR THOUGHTS ARE ON SCALING EXPERIMENTS LIKE GWAS: HOW MANY PATIENTS DO YOU NEED, AND HOW DOES THAT RELATE TO HOW MANY DIFFERENT POSSIBLE INTERACTIONS THERE ARE IN THESE PATHWAYS? IT SEEMS TO ME LIKE A LOT OF PEOPLE ARE GOING AND DOING THESE EXPERIMENTS WITHOUT REALLY THINKING A LOT ABOUT HOW THEY'LL ANALYZE THEM LATER. >> THE BIGGEST PROBLEM IS THE FACT THAT MOST OF THE PHENOTYPES THAT WE WOULD LIKE TO STUDY ARE COMPLEX. WE DON'T KNOW IN ADVANCE HOW COMPLEX A PHENOTYPE IS. IN SIMPLE PHENOTYPES YOU HAVE A ONE-TO-ONE RELATIONSHIP BETWEEN GENETIC VARIATION AND PHENOTYPE, AND IF YOU ARE LUCKY AND THAT IS YOUR PHENOTYPE, THEN YOU CAN GET AWAY WITH WHATEVER NUMBER OF CASES. IF THE PHENOTYPE IS COMPLEX, I DOUBT AN ASSOCIATION STUDY WILL DO IT ALL BY ITSELF. YOU HAVE TO HAVE SAMPLES, AND THAT'S WHY WE'RE LOOKING AT THIS NETWORK LEVEL, BECAUSE IT PROVIDES ADDITIONAL OR ORTHOGONAL JUSTIFICATION FOR WHY THOSE TWO ARE RELATED. BECAUSE NOT ONLY DO YOU FIND AN ASSOCIATION -- WE HAVE SOME CUTOFF HERE. THIS CUTOFF, IF YOU TAKE A STUDY, WOULD BE THAT. THIS IS TOO LITTLE.
SO THOSE ARE POSSIBLE INTERACTIONS, BUT IT WILL NOT SURFACE MOST OF THEM. THEN WE FOLLOW WITH THE NETWORK APPROACH: FOR WHICH OF THOSE CAN WE SEE SIGNIFICANT CURRENT GOING BETWEEN HERE AND THERE? THEN YOU CAN COMPUTE THE FUTURE FOR THIS. THE FUTURE FOR THE GWAS STUDIES THAT WE DO IS NETWORK-LEVEL ADDITIONAL EVIDENCE TO SOMEHOW REDUCE THE IMPACT OF THE COMPLEXITY OF THOSE PHENOTYPES. >> I HAVE A SET OF GENERAL QUESTIONS, TERRY. IN CONSIDERING THE COMPLEXITY OF BIOLOGICAL SYSTEMS, IS IT CONCEIVABLE THAT THE MATHEMATICAL TOOLS THAT WE CURRENTLY HAVE ARE NOT -- ARE REALLY JUST AN APPROXIMATION TO WHAT WE WILL EVENTUALLY NEED TO UNDERSTAND SOME OF THESE NETWORKS? ARE THERE ADDITIONAL MATHEMATICS THAT NEED TO BE DEVELOPED THAT COULD HELP BIOLOGISTS ANALYZE COMPLICATED SYSTEMS? >> [LAUGHTER] >> I THINK THEY HAVE TO DEVELOP AS THE BIOLOGICAL INSIGHT IS RISING, SO DEPENDING ON THE DATA AVAILABLE. OF COURSE WE HAVE QUITE A TOOLBOX OF MATHEMATICAL TOOLS. WE CAN USE MATHEMATICAL EQUATIONS TO MODEL THINGS PRECISELY, BUT WE DON'T HAVE THE DATA TO DO THAT. SO THE TOOLS WILL HAVE TO BE DEVELOPED ALL THE TIME, BECAUSE WE'RE SURPRISED BY THE TYPE OF DATA WE HAVE, AND IN ORDER TO MINE THE DATA, IN ORDER TO UTILIZE IT IN THE BEST POSSIBLE WAY, WE HAVE TO LOOK BACK AND SEE WHAT WAS THE MATHEMATICS. IN AN IDEAL WORLD WE HAVE NETWORKS WHICH ARE PRECISELY DEFINED, WHICH HAVE ALL THE EQUILIBRIUMS AND ALL THE PARAMETERS THAT -- AND WE WILL HAVE COMPUTERS -- I DON'T THINK THIS IS GOING TO HAPPEN SOON. >> IF THERE ARE NO OTHER QUESTIONS, LET'S THANK TERRY FOR A VERY STIMULATING TALK. [APPLAUSE]