>> WELCOME TO THE NEXT INSTALLMENT OF THE BIOWULF SEMINAR. TODAY WE'RE LUCKY TO HAVE RAZVAN CHEREJI. HE GOT HIS DEGREE IN 2013 AT RUTGERS UNIVERSITY AND HAS HIS Ph.D. IN THE CHILD HEALTH AND HUMAN DEVELOPMENT WHERE HE'S BEEN WORKING ON NUCLEOSOME REMODELING FACTORS. WELCOME, RAZVAN.
>> THANK YOU VERY MUCH FOR INVITING ME AND THANK YOU FOR COMING HERE TODAY. SO IN THE PREVIOUS SEMINAR WE LEARNED TO BUILD DOG USING TWO BILLION STEPS. I'LL SHOW YOU HOW TO FIT MILLIONS OF METERS OF DNA INSIDE THE HUMAN BODY. LET'S START WITH A FUN FACT. IT'S ESTIMATED THAT THE HUMAN BODY CONTAINS 300 MILLION METERS OF DNA, ENOUGH TO GO FROM HERE TO THE SUN AND BACK MORE THAN 300 TIMES. IT'S ALSO ENOUGH TO CIRCLE THE EARTH'S EQUATOR 300,000 TIMES. SO HOW IS IT POSSIBLE TO PACK SO MANY MOLECULES INSIDE OUR CELLS? THE BASIC UNIT OF DNA PACKAGING IS THE NUCLEOSOME, WHICH CONTAINS 147 BASE PAIRS OF DNA WRAPPED AROUND A HISTONE OCTAMER, SHOWN HERE IN DIFFERENT COLORS. THE DNA BETWEEN NUCLEOSOMES IS CALLED THE LINKER DNA. THE NUCLEOSOMAL DNA IS PRECLUDED FROM INTERACTING WITH OTHER PROTEINS, AND THIS AFFECTS ALL THE DNA-RELATED PROCESSES IN THE CELLS. FOR EXAMPLE, IT AFFECTS TRANSCRIPTION, DNA REPLICATION AND SO FORTH. IN THE CASE OF TRANSCRIPTION IT'S IMPORTANT TO KNOW THE NUCLEOSOME ORGANIZATION OF THE GENE PROMOTERS. FOR EXAMPLE, IF THIS IS A GENE AND THE PROMOTER IS OCCUPIED BY A NUCLEOSOME, THE TRANSCRIPTION FACTORS MAY NOT BE ABLE TO FIND THEIR TARGET SITES AND CANNOT ACTIVATE THE GENE, BUT IF THEY CAN FIND THE TARGET SITES THEY CAN REGULATE THE GENE EXPRESSION. SO WE ARE INTERESTED IN MAPPING THE NUCLEOSOME POSITIONS AT THE GENOME LEVEL. HOW CAN WE DO THIS? THE TYPICAL METHOD OF MAPPING NUCLEOSOMES IS CALLED MNase-seq. FIRST THE CHROMATIN FIBER IS ISOLATED FROM THE CELLS AND DIGESTED WITH MNase, WHICH CUTS THE LINKER DNA, AND THEN THE REMAINING DNA IS SEQUENCED AND MAPPED TO THE GENOME. DOING THIS, FOR EVERY GENOMIC LOCUS WE CAN FIND A CLUSTER OF DNA SEQUENCES WHICH ORIGINATE FROM NUCLEOSOMES FROM DIFFERENT CELLS.
IF WE COMPUTE THE NUMBER OF SEQUENCES WHICH OCCUPY EVERY BASE PAIR IN THE GENOME WE OBTAIN THE NUCLEOSOME COVERAGE PROFILE, OR OCCUPANCY PROFILE. HOWEVER, IF WE ONLY MAP THE CENTERS OF THE FRAGMENTS WE OBTAIN THE NUCLEOSOME CENTERS, OR THE DYAD DISTRIBUTION. UNFORTUNATELY, THERE ARE A FEW PROBLEMS WITH THIS METHOD. FIRST, DIFFERENT DNA SEQUENCES ARE DIGESTED AT DIFFERENT RATES. IF YOU DIGEST THE CHROMATIN TO DIFFERENT DEGREES YOU MAY GET DIFFERENT PROFILES. HERE ARE THE OCCUPANCY PROFILES OBTAINED BY STACKING ALL THE DNA FRAGMENTS THAT ARE OBTAINED BY DIGESTING THE CHROMATIN, AND YOU CAN SEE THAT IN SOME REGIONS YOU OBTAIN HIGHER OCCUPANCY THE MORE YOU DIGEST THE CHROMATIN, WHILE IN OTHER REGIONS YOU SEE LOWER OCCUPANCY THE MORE YOU DIGEST THE CHROMATIN. SO THE OCCUPANCY PROFILE IS NOT VERY QUANTITATIVE. FURTHERMORE, EVEN OTHER PROTEINS CAN BE MISTAKEN FOR NUCLEOSOMES IF THEY OFFER STRONG PROTECTION AGAINST MNase. IF YOU LOOK AT THESE TWO PEAKS HERE, WHICH HAVE A HIGH OCCUPANCY IN THE MNase-seq EXPERIMENTS, YOU CAN SEE THEM DISAPPEAR IN THE CORRESPONDING EXPERIMENT WHERE YOU MAP PRECISELY THE HISTONE CONTENT. THEY'RE THE FOOTPRINT OF A DIFFERENT PROTEIN, NOT OF A NUCLEOSOME, WHICH BY DEFINITION CONTAINS HISTONES. SO TO ELIMINATE THESE PROBLEMS, IN COLLABORATION WITH THE LAB WE DEVELOPED A NEW METHOD WHICH DOESN'T RELY ON MNase. IT'S A CHEMICAL METHOD OF MAPPING. AGAIN, THIS IS THE STRUCTURE OF THE NUCLEOSOME. WE MUTATED THE HISTONE IN BLUE AT A GLUTAMINE RESIDUE AND ATTACHED [INDISCERNIBLE], AND WE WERE ABLE TO CUT THE DNA PRECISELY HERE AND OBTAINED A SHORT DNA FRAGMENT OF ABOUT 50 BASE PAIRS FROM EACH NUCLEOSOME. INSTEAD OF OBTAINING THE FULL NUCLEOSOME FOOTPRINT, WE OBTAINED ONLY SHORT FRAGMENTS OF 50 BASE PAIRS, BUT THEIR CENTERS INDICATE THE POSITIONS OF THE NUCLEOSOME DYADS. THIS IS THE DYAD DISTRIBUTION WE OBTAINED IN THE CELLS, SHOWN HERE IN THIS MAP. EVERY ROW REPRESENTS A GENE. WE HAVE ABOUT 5,000 ROWS.
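The two profiles described above, coverage (every base pair covered by a fragment) and dyads (fragment centers only), can be illustrated with a minimal Python sketch. The function name `coverage_and_dyads`, the half-open `(start, end)` fragment format, and the toy coordinates are assumptions made for illustration; they are not taken from the talk or from any particular pipeline.

```python
def coverage_and_dyads(fragments, genome_length):
    """Compute a coverage profile and a dyad (center) profile.

    fragments: iterable of (start, end) pairs, 0-based, end exclusive
               (a hypothetical input format chosen for this sketch).
    Returns two lists of length genome_length.
    """
    diff = [0] * (genome_length + 1)   # difference array for coverage
    dyads = [0] * genome_length        # counts of fragment centers

    for start, end in fragments:
        diff[start] += 1               # fragment begins covering here
        diff[end] -= 1                 # fragment stops covering here
        # Center of the fragment; for even lengths this picks the
        # base just left of the midpoint.
        dyads[(start + end - 1) // 2] += 1

    # A running sum of the difference array gives per-base coverage.
    coverage = []
    running = 0
    for delta in diff[:genome_length]:
        running += delta
        coverage.append(running)
    return coverage, dyads


if __name__ == "__main__":
    # Two overlapping toy fragments on a 10 bp "genome".
    cov, dy = coverage_and_dyads([(0, 5), (2, 7)], 10)
    print(cov)
    print(dy)
```

Stacking whole fragments smears each nucleosome over its full footprint, while counting only centers keeps one position per fragment, which is why the dyad profile is the sharper of the two.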
EACH REPRESENTS A GENE AND I ALIGNED ALL THE GENES AT THE POSITION OF THE PLUS 1 NUCLEOSOME, WHICH IS THE FIRST NUCLEOSOME ON THE GENE BODY, AND SORTED THE GENES ACCORDING TO THE TYPICAL DISTANCE BETWEEN THE PLUS 1 AND PLUS 2 NUCLEOSOMES, SO BETWEEN THE FIRST TWO NUCLEOSOMES FOUND ON THE GENE BODIES. AS YOU CAN SEE, WE OBTAINED VERY HIGH RESOLUTION IN THIS MAP. WE CAN EVEN SEE NUCLEOSOME POSITIONS SEPARATED BY 10 BASE PAIRS. AS A COMPARISON, THIS IS THE RESOLUTION YOU OBTAIN WITH THE COMMON MNase-seq METHOD AND THIS IS THE RESOLUTION OBTAINED WITH THE EXPERIMENT WHERE YOU DON'T USE MNase TO BREAK THE CHROMATIN. THIS EXPERIMENT HAS A HIGHER RESOLUTION IN MAPPING SINGLE NUCLEOSOMES. BUT THIS IS NOT THE ONLY INFORMATION WE OBTAINED IN THIS EXPERIMENT. IF WE LOOK AT THE LENGTH DISTRIBUTION OF ALL THE FRAGMENTS OBTAINED AFTER THE REACTION, WE SEE WE ALSO OBTAINED OTHER FRAGMENT SIZES, LONGER FRAGMENTS. WE CAN SEE WE HAVE THREE GROUPS OF FRAGMENTS HERE. WHAT ARE THESE FRAGMENTS AND WHERE DO THEY ORIGINATE FROM? UNTIL NOW I ONLY TOLD YOU ABOUT THE SHORT FRAGMENTS OF 50 BASE PAIRS, WHICH ARE INDICATED HERE IN THIS STRONG PEAK. EACH HAS THREE, SO THESE ARE THE SHORT FRAGMENTS, BUT WE ALSO OBTAIN FRAGMENTS BETWEEN NEIGHBORING NUCLEOSOMES. THE SECOND GROUP OF FRAGMENTS CORRESPONDS TO THE FRAGMENTS BETWEEN TWO NUCLEOSOMES, AND SOME ARE NOT OBTAINED FROM LONGER SITES LIKE THE ONE SHOWN HERE. DURING THE REACTION IT TURNS OUT THERE'S A WAY IT WORKS: YOU ALWAYS HAVE TO PREPARE THE LIBRARY. AND THERE'S AN INTERMEDIATE FRAGMENT, AND CORRESPONDING TO FOUR BASE PAIRS WE OBTAIN THE SAME VALUE FOR THE CROSS-CORRELATION BETWEEN THE INTERMEDIATE FRAGMENTS AND THE LEFT END OF THE SHORT FRAGMENTS. SO WE FOUND THAT THERE'S A MISSING GAP OF FOUR BASE PAIRS.
NOW, IF WE KNOW THE SIZE OF THE GAP, FOUR BASE PAIRS, AND SINCE WE DIRECTLY MEASURE THE SIZE OF THIS LONG FRAGMENT WHICH GOES FROM HERE TO HERE, IF WE ADD THESE TWO WE OBTAIN A DIRECT MEASUREMENT: THIS FRAGMENT PLUS THE GAP GIVES US EXACTLY THE SO-CALLED NUCLEOSOME REPEAT LENGTH. LOOKING AT THE LENGTH DISTRIBUTION OF THE FRAGMENTS WE OBTAINED, WE SEE THE LENGTHS HAVE THIS DISTRIBUTION WITH MAXIMA CORRESPONDING TO THESE VALUES, AND THESE ARE THE PREFERRED SIZES, 151 AND SO ON. IF WE HAVE THE NUCLEOSOME REPEAT LENGTH AND WE SUBTRACT THE SIZE OF THE NUCLEOSOME, WE OBTAIN THE LENGTH OF THE LINKER BETWEEN NEIGHBORING NUCLEOSOMES. SO JUST SUBTRACTING THE SIZE OF THE NUCLEOSOME WE OBTAIN LINKERS WITH LENGTHS FOLLOWING THIS RULE: THE LENGTHS OF THE LINKERS FOLLOW THE RULE OF 10N PLUS 5. NOW WHAT DOES THIS MEAN? DOES IT MEAN THAT SOME GENES WILL HAVE VERY SHORT LINKERS OF FIVE BASE PAIRS WHILE OTHERS WILL HAVE LINKERS OF 15 BASE PAIRS, OR IS IT THAT EVEN A SINGLE GENE CAN HAVE A WIDE RANGE OF LINKER LENGTHS? WITH OUR DATA IT'S EASY TO LOOK AT THE FRAGMENT SIZE DISTRIBUTION CORRESPONDING TO A SINGLE GENE, AND WE SAW THAT EVEN WHEN WE LOOKED AT THE FRAGMENTS CORRESPONDING TO A SINGLE GENE WE STILL OBTAINED THE SAME BROAD DISTRIBUTION, SO EVEN A SINGLE GENE CAN HAVE A WIDE RANGE OF LINKER LENGTHS. WE KNOW FROM COMPUTER SIMULATIONS THAT EVEN A SMALL CHANGE IN THE LINKER LENGTH OF FIVE BASE PAIRS CAN INDUCE A BIG CHANGE IN THE ORGANIZATION OF THE CHROMATIN FIBER. SO A GENE CAN HAVE DIFFERENT CHROMATIN ORGANIZATIONS IN THE POPULATION OF CELLS. WE ALSO LOOKED AT THE CORRELATION BETWEEN THE NUCLEOSOME SPACING AND DIFFERENT OTHER PROPERTIES. FOR EXAMPLE, WHAT WE SAW IS THAT THE SPACING CORRELATES WITH THE H1 HISTONE PRESENT IN THE LINKERS, AND CORRELATES WITH THE HISTONE VARIANT WHICH IS INCORPORATED IN THE PLUS 1 NUCLEOSOME OF THE GENE. IT ALSO CORRELATES WITH THE SIZE OF THE SO-CALLED NUCLEOSOME-DEPLETED REGION, THE SIZE OF THE GENE PROMOTER, AND CORRELATES WITH THE TRANSCRIPTION LEVELS. WE LOOKED AT DATA FOR DIFFERENT SUBUNITS.
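The linker-length arithmetic from earlier in this passage (long fragment plus the four-base-pair gap gives the repeat length; subtracting the 147-base-pair nucleosome gives the linker) can be sketched in a few lines of Python. The function names and the example fragment lengths below are hypothetical, chosen only to illustrate the 10n + 5 rule; they are not measured values from the talk.

```python
NUCLEOSOME_SIZE = 147  # base pairs of DNA wrapped in one nucleosome
GAP = 4                # missing gap, in base pairs, mentioned in the talk


def linker_lengths(long_fragment_lengths):
    """Convert long-fragment lengths into linker lengths.

    repeat length = fragment length + GAP
    linker length = repeat length - NUCLEOSOME_SIZE
    """
    return [length + GAP - NUCLEOSOME_SIZE for length in long_fragment_lengths]


def follows_10n_plus_5(linker):
    """True if the linker length has the form 10n + 5 with n >= 0."""
    return linker >= 0 and linker % 10 == 5


if __name__ == "__main__":
    # Hypothetical fragment lengths, not data from the experiment.
    for linker in linker_lengths([148, 158, 168]):
        print(linker, follows_10n_plus_5(linker))
```

Applied per gene rather than genome-wide, the same conversion would give the single-gene linker distribution that the talk describes as still being broad.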
ALSO AN ACTIVE MARK AND WE LOOKED AT THE BINDING TO THE PROMOTERS. BASICALLY THE CHROMATIN ORGANIZATION CORRELATES WELL WITH THE DIFFERENT MEASURES OF THE TRANSCRIPTION LEVEL OF THE GENES. THIS IS A SUMMARY OF THE CORRELATIONS BETWEEN ALL THE PROPERTIES. YOU CAN EASILY SEE THE CORRELATION COEFFICIENTS ARE HIGH, CLOSE TO ONE, OR ALMOST MINUS ONE FOR THE CASES WHERE THEY ANTI-CORRELATE. SO WE HAVE A PRECISE METHOD OF MAPPING NUCLEOSOMES AND THE NEXT STEP IS TO UNDERSTAND THE MECHANISMS THAT CREATE THIS ORGANIZATION AND TO BUILD A RIGOROUS MODEL THAT CAN PREDICT THE NUCLEOSOME ORGANIZATION THAT IS OBTAINED IN EXPERIMENTS. LET'S LOOK AT THE ORGANIZATION AND SEE HOW WE CAN EXPLAIN THIS. IN YEAST IT TURNS OUT THE ORGANIZATION IS SIMPLE COMPARED TO HIGHER ORGANISMS. BASICALLY, THE WHOLE GENOME IS COVERED BY NUCLEOSOMES WITH ONLY SMALL GAPS AT THE GENE PROMOTERS. LET'S LOOK MORE CAREFULLY AT THE GENE ENDS TO UNDERSTAND WHAT'S HAPPENING THERE. FOR THIS I ALIGNED ALL THE YEAST GENES AT THEIR FIVE-PRIME ENDS AND THEIR THREE-PRIME ENDS AND TOOK INTO ACCOUNT THE ORIENTATION OF THE NEARBY GENE. I SPLIT OFF THE SET OF DIVERGENT GENES WHEN I ALIGNED THEM, AND SPLIT THE SET OF CONVERGENT GENES FROM THE INDEPENDENT GENES WHEN I ALIGNED THEM. DOING THIS ALIGNMENT WE OBTAINED THIS PATTERN OF ORGANIZATION. HERE EVERY ROW REPRESENTS A YEAST GENE; THE RED COLOR INDICATES NUCLEOSOMES AND THE BLUE COLOR INDICATES THE GENE PROMOTERS. WE SEE THE GENE PROMOTERS ARE NUCLEOSOME DEPLETED. YOU CAN SEE THE VERTICAL STRIPES, AND YOU SEE THE GENES HAVE THE SAME ARRANGEMENT OF NUCLEOSOMES. WE SAY THE NUCLEOSOMES ARE PHASED. NOW, WE WANT TO UNDERSTAND WHAT IS GENERATING THIS PATTERN OF NUCLEOSOME ORGANIZATION.
SOMETHING IMPORTANT TO NOTICE HERE IS THAT WE HAVE THESE BLUE STRIPES INDICATING NUCLEOSOME-DEPLETED REGIONS AT THE FIVE-PRIME ENDS OF THE GENES, BUT AT THE THREE-PRIME ENDS OF THE GENES THERE IS NO NUCLEOSOME-DEPLETED REGION UNLESS THE NEARBY GENE HAS ITS PROMOTER OVERLAPPING WITH THE THREE-PRIME END. SO IF YOU LOOK ONLY AT THE ALIGNMENTS OF THE THREE-PRIME ENDS OF THE GENES WHICH ARE AWAY FROM ANY OTHER PROMOTER, THEN YOU DON'T SEE A NUCLEOSOME-DEPLETED REGION. NOW, WHAT GENERATES THIS ORGANIZATION? WELL, WE KNOW THAT IN THE CELLS THERE ARE OTHER PROTEINS AND MANY ARE FOUND AT THE GENE PROMOTERS. THERE ARE OTHER DNA-BINDING PROTEINS AT THE GENE PROMOTERS. IF WE HAVE NON-HISTONE PROTEINS THAT LIKE TO BIND TO THE PROMOTERS, THEY'LL PREVENT HISTONES FROM BINDING THERE. USING STATISTICAL MECHANICS IT'S EASY TO COMPUTE THE PREDICTED NUCLEOSOME DISTRIBUTION IN AN ENERGY LANDSCAPE WHERE YOU HAVE A SIMPLE ENERGY BARRIER. WE CAN COMPUTE THE DISTRIBUTION IF WE ASSUME EVERY GENE PROMOTER HAS AN ENERGY BARRIER, AND I'LL SKETCH THE WAY WE COMPUTE THIS. I HOPE I WON'T BORE YOU TOO MUCH WITH THE PHYSICS. SAY YOU HAVE A BINDING SITE WHICH CAN BE BOUND BY A PROTEIN, A TWO-STATE SYSTEM. IF YOU HAVE THE TWO ENERGY LEVELS CORRESPONDING TO THE TWO CONFIGURATIONS OF THE SYSTEM, THEN THE PROBABILITY OF FINDING THE SYSTEM IN A GIVEN STATE IS JUST THE RATIO OF THE BOLTZMANN WEIGHTS. IN OUR CASE IT'S MORE COMPLICATED. WE HAVE A HUGE ONE-DIMENSIONAL LATTICE CORRESPONDING TO A CHROMOSOME, AND NUCLEOSOMES WHICH CAN BIND ON THIS. MOREOVER, DIFFERENT CELLS WILL HAVE DIFFERENT CONFIGURATIONS, AND THE POSITIONS CAN HAVE DIFFERENT BINDING ENERGIES DEPENDING ON THE DNA SEQUENCE. IT'S A LITTLE MORE COMPLICATED BUT STILL THE PROCEDURE OF COMPUTING THE PROBABILITY OF HAVING THE SYSTEM IN EVERY STATE IS THE SAME. IF WE KNOW THE ENERGY CORRESPONDING TO EACH CONFIGURATION, THE PROBABILITY OF FINDING THE SYSTEM IN THAT CONFIGURATION IS A FACTOR LIKE THIS DIVIDED BY THE NORMALIZATION, WHICH WE CALL THE PARTITION FUNCTION.
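The Boltzmann weights and partition function described above can be written out explicitly. The notation here is an assumption for illustration, since the slide formulas are not in the transcript: $E_0$ and $E_1$ are the energies of the unbound and bound states, $E(c)$ is the energy of configuration $c$, and $\beta = 1/(k_B T)$.

```latex
% Two-state system: probability of the bound state
P_{\text{bound}} = \frac{e^{-\beta E_1}}{e^{-\beta E_0} + e^{-\beta E_1}}

% General case: probability of configuration c, normalized by
% the partition function Z (a sum over all configurations c')
P(c) = \frac{e^{-\beta E(c)}}{Z},
\qquad
Z = \sum_{c'} e^{-\beta E(c')}
```

The two-state formula is the special case of the general one in which the sum defining $Z$ has only two terms.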
AND IF WE HAVE THIS, WE CAN COMPUTE ALL OTHER PROPERTIES OF THE SYSTEM. WE'RE INTERESTED IN COMPUTING THE PROBABILITY OF HAVING A NUCLEOSOME AT POSITION I. WE NEED TO ADD THE PROBABILITIES OF THE CORRESPONDING STATES, AND WE OBTAIN THIS PROBABILITY, WHERE Z MINUS AND Z PLUS ARE AGAIN THE SAME KIND OF PARTITION FUNCTION, AND BASICALLY IT CAN BE SOLVED USING DYNAMIC PROGRAMMING. BASICALLY THERE ARE TWO RECURSION RELATIONS FOR THE PARTITION FUNCTIONS WE NEED TO COMPUTE, WHICH COUNT THE CONFIGURATIONS IN THE BOX. THERE ARE TWO TYPES OF CONFIGURATIONS IF YOU LOOK AT WHAT'S HAPPENING, AND BOTH ARE INCLUDED IN THIS FACTOR. USING THESE RELATIONS AND THE SIMPLE BOUNDARY CONDITIONS, WHICH BASICALLY SAY THAT WHEN THE BOX IS TOO SMALL THE PARTITION FUNCTION EQUALS ONE, WE CAN EASILY COMPUTE THE DISTRIBUTION OF THE PARTICLES IN ANY ENERGY LANDSCAPE. AND YOU CAN SOLVE THE INVERSE PROBLEM, WHICH MEANS TO INFER THE ENERGY LANDSCAPE IF YOU KNOW THE DISTRIBUTION OF PARTICLES. WITH THIS SIMPLE MODEL IT'S EASY TO PREDICT THE NUCLEOSOME DISTRIBUTION IN THE LANDSCAPE, WHICH IS A FLAT ENERGY LANDSCAPE WITH ENERGY BARRIERS AT THE GENE PROMOTERS. FROM THERE WE FIT THE PARAMETERS OF THE MODEL, WHICH ARE PROPERTIES OF THE ENERGY BARRIERS AND THE NUCLEOSOMES. AFTER WE FIT THE PARAMETERS WE CAN COMPUTE OR PREDICT THE GENOME-WIDE NUCLEOSOME DISTRIBUTION ON THE OTHER CHROMOSOMES. DOING THIS WE OBTAIN THE PREDICTED NUCLEOSOME DISTRIBUTION. ON THE LEFT I ALIGNED ALL THE YEAST GENES AT THEIR ENDS, AND CLEARLY WE HAVE A GOOD PREDICTION USING THE STATISTICAL MECHANICS MODEL, WHICH BASICALLY DOESN'T INCLUDE THE CONTRIBUTIONS OF THE DNA SEQUENCE. AND THERE ARE BARRIER COMPLEXES WHICH BIND. THE PREDICTION OF THE MODEL IS THAT WHENEVER YOU HAVE THESE BARRIER COMPLEXES BOUND AT THE GENE PROMOTERS YOU'LL SEE NUCLEOSOME-DEPLETED REGIONS, WHILE IF WE HAVE GENE PROMOTERS WITHOUT THE BARRIER WE WON'T HAVE A NUCLEOSOME-DEPLETED REGION, AND IF WE LOOK AT THE AVERAGE DISTRIBUTION WE WON'T SEE THE NICE PATTERNS, SO THE NUCLEOSOMES WILL NOT BE IN PHASE. WELL, THIS IS A NICE MODEL AND PREDICTION BUT UNFORTUNATELY ALL THE YEAST GENES LOOK LIKE THIS.
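The dynamic-programming idea described above, with two recursion relations, a forward and a backward partition function, and a boundary condition of one for boxes too small to hold a particle, can be sketched for hard rods on a one-dimensional lattice. This is a minimal illustration under stated assumptions, not the speaker's implementation: the function name `start_probabilities`, the use of energies in units of kT, the chemical potential `mu`, and the toy landscapes are all hypothetical. It multiplies raw Boltzmann weights, so it is only suitable for short lattices; a chromosome-scale version would need to work with logarithms or rescaling.

```python
import math


def start_probabilities(energies, size, mu=0.0):
    """Probability that a particle (nucleosome) starts at each site.

    energies[i]: binding energy, in units of kT, of a particle covering
                 sites i .. i + size - 1 (assumed input for this sketch).
    size:        number of sites one particle covers.
    mu:          chemical potential in units of kT (assumed parameter).
    """
    n = len(energies)
    weight = [math.exp(mu - e) for e in energies]  # Boltzmann weights

    # Forward partition function: zf[i] sums over all configurations
    # of the box made of sites 0 .. i - 1. Small boxes give 1.
    zf = [1.0] * (n + 1)
    for i in range(1, n + 1):
        zf[i] = zf[i - 1]                      # site i - 1 left empty
        if i >= size:                          # particle ends at site i - 1
            zf[i] += zf[i - size] * weight[i - size]

    # Backward partition function: zb[i] covers sites i .. n - 1.
    zb = [1.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        zb[i] = zb[i + 1]                      # site i left empty
        if i + size <= n:                      # particle starts at site i
            zb[i] += weight[i] * zb[i + size]

    z_total = zf[n]
    return [
        zf[i] * weight[i] * zb[i + size] / z_total if i + size <= n else 0.0
        for i in range(n)
    ]


if __name__ == "__main__":
    # Flat landscape with a high barrier in the middle (toy values).
    landscape = [0.0] * 10 + [50.0] * 5 + [0.0] * 10
    for site, p in enumerate(start_probabilities(landscape, size=3)):
        print(site, round(p, 4))
```

Each start probability is the weight of all configurations to the left, times the weight of the particle itself, times the weight of all configurations to the right, divided by the total partition function. A barrier enters simply as a stretch of large energies, which suppresses particles that would start there.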
AMONG THE GENES WE JUST HAVE ONE GENE THAT IS SILENT AND 5,000 WHICH ARE MORE OR LESS ACTIVE. TO TEST THE MODEL WE WOULD HAVE TO REMOVE THE COMPLEX FROM A GENE PROMOTER AND SEE IF THE NUCLEOSOMES START TO SHIFT AND THE NUCLEOSOME-DEPLETED REGION DISAPPEARS. UNFORTUNATELY WE DON'T KNOW THE PRECISE IDENTITY OF THE PROTEINS BOUND TO THE GENE PROMOTERS. LIKELY THEY'RE ESSENTIAL FOR THE CELLS, SO IF WE REMOVE THE FACTORS THAT ARE FOUND AT THE GENE PROMOTERS IT'S LIKELY THE CELLS WILL DIE. BUT WE CAN STILL TEST THIS. SAY YOU HAVE HALF THE GENES ACTIVE AND HALF SILENT IN ANY CELL TYPE. THEN, FOR EXAMPLE, LOOKING AT A DNase-seq EXPERIMENT IT'S EASY TO IDENTIFY THE PROMOTERS BOUND BY THESE COMPLEXES BECAUSE THOSE WILL BE HYPERSENSITIVE, AND IT'S EASY TO SEPARATE THE GENES INTO GROUPS ACCORDING TO WHETHER THEY ARE BOUND BY THE EXTRA PROTEINS. THEN THE PREDICTION OF MY MODEL IS THAT FOR THE BOUND GENES WE FIND NUCLEOSOME-DEPLETED REGIONS AT THE GENE PROMOTERS, AND FOR THE OTHER GENES, NOT BOUND BY THE PROTEINS, THE NUCLEOSOMES WILL BE MORE DISORGANIZED AND YOU WON'T SEE NUCLEOSOME PHASING. IF WE LOOK AT THE NUCLEOSOME ORGANIZATION, FOR THE GENES THAT HAVE BARRIER COMPLEXES BOUND TO THEIR PROMOTERS, WHEN WE LOOK AT THE DISTRIBUTION THEY HAVE A NUCLEOSOME-DEPLETED REGION AND WE SEE THE PHASING, WHILE ON THE GENES WHICH DIDN'T HAVE THE ACCESSIBLE REGIONS, WHICH DIDN'T HAVE THE BARRIER COMPLEXES PRESENT, THE NUCLEOSOMES ARE MORE DISORGANIZED AND WE DON'T SEE PHASING. AGAIN, THIS WAS THE CASE FOR A CELL TYPE FROM FLIES, BUT WE SEE THE SAME SITUATION IN OTHER CELL TYPES FROM MOUSE AND FROM HUMAN. LET ME SUMMARIZE.
SO WE DEVELOPED A NEW METHOD OF MAPPING NUCLEOSOMES PRECISELY, AS WELL AS THE LINKERS ORIGINATING FROM PAIRS OF NEIGHBORING NUCLEOSOMES FROM THE SAME CELL. WE SHOWED INDIVIDUAL GENES CAN HAVE A WIDE RANGE OF LINKER LENGTHS, WHICH SUGGESTS THAT IN A POPULATION OF CELLS EVEN SINGLE GENES CAN HAVE A WIDE RANGE OF CHROMATIN FIBER ORGANIZATIONS. AND IN THE LAST PART OF MY TALK I SHOWED YOU THAT A SIMPLE BIOPHYSICAL MODEL IS ABLE TO EXPLAIN THE NUCLEOSOME ORGANIZATION AT THE GENE PROMOTERS AND IS ABLE TO EXPLAIN THE NUCLEOSOME PHASING PATTERN OBSERVED NEAR THE GENE PROMOTERS. EVEN THOUGH WE ONLY PUT THE ENERGY BARRIERS AT THE FIVE-PRIME ENDS OF THE GENES WE FOUND A GOOD PREDICTION AT THE THREE-PRIME ENDS OF THE GENES. IN THE MAPS, THE LEFT AND RIGHT SIDES HAVE VERY GOOD CORRELATION WITH THE REAL DATA. I WOULD LIKE TO THANK ALL MY COLLABORATORS WHO PROVIDED ME WITH ENOUGH DATA TO PLAY WITH AND I WOULD LIKE TO THANK THE BIOWULF TEAM FOR PROCESSING THE DATA. IF YOU WANT TO HEAR MORE ABOUT MY WORK, THIS IS HOW TO GET IN TOUCH WITH ME. THANK YOU FOR YOUR ATTENTION. I'LL BE HAPPY TO ANSWER ANY QUESTIONS.
[QUESTION INAUDIBLE]
>> THE QUESTION WAS WHETHER IN THE MAMMALIAN CELLS WE USED THE CHEMICAL MAPPING. NO, IT WAS MNase-seq DATA. IT'S VERY GOOD FOR DETECTING THE POSITIONS OF THE NUCLEOSOMES. THE ADVANTAGE OF THE CHEMICAL MAPPING IS THAT THE PROFILES ARE MORE EVEN SO WE DON'T SEE THIS VARIATION. BUT IT'S PERFECTLY VALID TO MAP THE NUCLEOSOMES WITH MNase-seq AND THAT'S WHAT WE'VE USED.
[QUESTION INAUDIBLE]
>> THE QUESTION WAS HOW MANY CELLS WE USED. IT WASN'T A SINGLE-CELL TECHNIQUE OR OPTIMIZED FOR THAT. ALL THE EXPERIMENTS WERE DONE ON POPULATIONS OF CELLS, AND I DON'T KNOW THE ORDER OF MAGNITUDE FOR THE NUMBER OF CELLS, BUT I THINK IT'S COMPARABLE TO THE NUMBER OF CELLS USED IN AN MNase-seq EXPERIMENT, SO NOT A SINGLE-CELL ASSAY AT THIS POINT.
WE SOLVED THE DIRECT PROBLEM.
WE USED THE ENERGY LANDSCAPE TO PREDICT THE NUCLEOSOME DISTRIBUTION FOR THE FIRST YEAST CHROMOSOME AND THEN WE ADJUSTED THE PARAMETERS OF THE MODEL SUCH THAT THE PREDICTED DISTRIBUTION ON CHROMOSOME 1 MATCHED THE REAL ONE. THE INPUT WAS THE REAL DISTRIBUTION ON CHROMOSOME 1 AND WE ADJUSTED THE MODEL. THIS WAS THE TRAINING DATA AND THEN WE TESTED USING ALL THE OTHER CHROMOSOMES.
[QUESTION INAUDIBLE]
>> IT WORKS IF YOU USE CHEMICAL MAPPING DATA. AS I SAID BEFORE, THE ONE ADVANTAGE OF THE CHEMICAL MAPPING IS THAT YOU HAVE THE PROFILES MORE EVEN, BUT THE POSITIONS ARE WELL DETERMINED EVEN WITH THE MNase-seq DATA.
[QUESTION INAUDIBLE]
>> THE QUESTION IS IF THERE'S AN ISSUE WITH THE [INDISCERNIBLE]. THE GOOD THING ABOUT THIS METHOD IS THAT IF YOU INCREASE THE EFFICIENCY OF CLEAVAGE, YOU ONLY IMPROVE THE DATA YOU RECOVER FROM THIS EXPERIMENT, BECAUSE IF YOU INCREASE THE EFFICIENCY OF CUTTING THE DNA YOU WILL ONLY CUT THE DNA AT THE HISTONES, SO YOU GET MORE AND MORE DATA BUT YOU DON'T GET THE PROBLEM OF THE MNase-seq EXPERIMENT. YOU DON'T HAVE THE NUCLEASE ACTIVITY SO YOU DON'T DESTROY DNA. THE MORE YOU DIGEST OR CLEAVE THE DNA WITH THE CHEMICAL APPROACH, THE MORE DATA YOU HAVE, SO YOU HOPE TO INCREASE THE EFFICIENCY AS MUCH AS POSSIBLE TO REDUCE EVERYTHING TO SHORT FRAGMENTS OF 50 BASE PAIRS.
SO THE QUESTION IS WHETHER THE DNA SEQUENCE PLAYS A ROLE IN THE POSITIONS OF THE [INDISCERNIBLE]. THAT'S TRUE. IN THIS SIMPLE MODEL WE DIDN'T INCLUDE ANY EFFECTS FROM THE DNA SEQUENCE, SO WE WERE ABLE TO GENERATE THE OVERALL OR THE COMMON PATTERNS OF NUCLEOSOMES NEAR THE PROMOTERS. SORRY, I LOST MY THOUGHT. THE DNA SEQUENCE PLAYS AN IMPORTANT ROLE IN FIXING THE PRECISE POSITION OF THE PLUS 1 NUCLEOSOME. IN OUR PAPER WE SHOWED THAT WHEN WE ALIGNED ALL THE PLUS 1 NUCLEOSOMES WE SAW A NICE PATTERN WHICH CORRELATES VERY WELL WITH THE ALTERNATIVE POSITIONS OF THE PLUS 1 NUCLEOSOME.
AT THE SINGLE-NUCLEOSOME LEVEL THE DNA SEQUENCE PLAYS A MAJOR ROLE IN FIXING THE PRECISE POSITION OF THE NUCLEOSOME. IT'S ONLY AT THE GLOBAL LEVEL, IF YOU JUST WANT TO SEE THE NUCLEOSOME PHASING PATTERN, THAT THE DNA SEQUENCE PLAYS A REDUCED ROLE AND THE BARRIER COMPLEX PLAYS A MORE IMPORTANT ROLE. YES?
[QUESTION INAUDIBLE]
>> THE QUESTION IS IF THIS MODEL CAN SOMEHOW PREDICT ANYTHING ABOUT TRANSCRIPTION, I GUESS. WELL, AT THIS STAGE THIS MODEL, AS I SHOWED YOU HERE, DOESN'T TAKE INTO ACCOUNT THE DNA SEQUENCE OR TRANSCRIPTION LEVELS. IT ONLY TAKES INTO ACCOUNT WHETHER GENE PROMOTERS ARE BOUND BY AN EXTRA PROTEIN. NOW, YOU CAN INCLUDE THE TRANSCRIPTION LEVEL IN THIS MODEL BY SAYING THE GENES WHICH ARE MOST TRANSCRIBED HAVE THE HIGHEST FREQUENCY OF TRANSCRIPTION INITIATION MACHINERY, WHICH IS PROBABLY THE BARRIER COMPLEX. SO YOU COULD INCLUDE THE TRANSCRIPTION IN THE MODEL BY ASSUMING DIFFERENT BARRIER COMPLEXES, SO THE ENERGY BARRIER WILL HAVE DIFFERENT HEIGHTS AT DIFFERENT PROMOTERS. WE DIDN'T DO THAT, BUT I THINK THAT WOULD BE THE EASIEST WAY TO INCLUDE SOME INFORMATION FROM THE TRANSCRIPTION IN THE MODEL. RIGHT NOW THE ONLY DIFFERENCE I SHOWED YOU WAS THAT THE PROMOTERS NOT BOUND BY THE BARRIER COMPLEXES ARE THE SILENT GENES AND THE OTHER GENES ARE THE ACTIVE GENES. BUT OF COURSE, WHEN YOU LOOK AT THE ACTIVE GENES, DIFFERENT PROMOTERS WILL BE BOUND BY DIFFERENT ACTIVATORS, MAYBE. THANK YOU.