THANK YOU VERY MUCH FOR PARTICIPATING IN THIS PRESENTATION, WHETHER YOU'RE PHYSICALLY HERE OR LISTENING ON THE WEBCAST. MY NAME IS YAFFA RUBENSTEIN AND TOGETHER WITH MY COLLEAGUE HELEN MOORE, WE'RE CO-LEADING THE BIOSPECIMEN INTEREST GROUP, WHICH IS SUPPORTED BY THE OFFICE OF -- RESEARCH, A COMPONENT OF NCATS, AND BY THE OFFICE OF BIOSPECIMEN -- AT THE NCI. THE TOPIC OF TODAY'S PRESENTATION IS GOING TO BE -- AND WE HAVE THREE GREAT SPEAKERS WHO ARE GOING TO ADDRESS ISSUES OF DOING RESEARCH ON THE DISEASES, BIOSPECIMEN RESEARCH, AND CONNECTING DATA AND SUBJECT -- DATA SHARING. THE FIRST SPEAKER IS GOING TO BE DR. FALK. JUST A FEW THINGS ABOUT DR. FALK. SHE'S AN MD AND ASSISTANT PROFESSOR IN THE DIVISION OF HUMAN GENETICS AT THE CHILDREN'S HOSPITAL OF PHILADELPHIA AND THE UNIVERSITY OF PENNSYLVANIA -- SCHOOL OF MEDICINE SINCE 2006. SHE'S A BONA FIDE CLINICAL DATA CLINICIAN -- SHE DEVELOPED IN FULL THE DIAGNOSTIC FROM THE -- DISEASES, INCLUDING SEQUENCING. DR. FALK IS PI OF AN NIH-FUNDED RESEARCH LABORATORY -- THAT INVESTIGATES THE METABOLIC CONSEQUENCES OF THE DISEASE AND TARGETED -- THERAPIES IN MOUSE AND TISSUE OF GENETIC -- DR. FALK HAS PUBLISHED MORE THAN 40 ARTICLES IN THE AREA OF HUMAN GENETICS AND -- DISEASE. SHE'S A MEMBER OF THE -- GENETICS MEDICINE, FOUNDER AND CO-LEADER OF THE RESEARCH FACILITY GROUP THAT HAS MORE THAN 100 PARTICIPANTS, AND SHE'S THE CHAIR-ELECT OF THE SCIENTIFIC AND MEDICAL ADVISORY BOARD AS WELL AS A MEMBER OF THE BOARD OF -- UNITED MEDICAL -- DISEASE FOUNDATION. I'M DELIGHTED TO INVITE DR. FALK TO PRESENT HER TALK.
THANK YOU SO MUCH FOR HAVING ME HERE. IT'S AN HONOR TO BE HERE AT THE NIH. I MENTIONED THAT I WAS ABLE TO DO A FELLOWSHIP HERE IN THE 90'S AND IT'S CHANGED SO MUCH, AND IT'S SUCH AN HONOR TO BE HERE TODAY. SO I'M GOING TO GIVE YOU A BRIEF OVERVIEW OF MITOCHONDRIAL DISEASE.
WE'LL TALK ABOUT WHAT IT MEANS TO HAVE MITOCHONDRIAL DISEASE, HOW YOU WIND UP WITH THAT DIAGNOSIS, WHAT WE CAN DO TO FIGURE IT OUT AND HELP YOU THERAPEUTICALLY, AND HOW WE CROSS THE LINE, IN THE INTEREST OF THIS GROUP, TO OBTAIN TISSUES FOR HUMAN SUBJECT RESEARCH, AND SOME OF THE CHALLENGES AND POSSIBILITIES OF DOING THAT. AND THEN I'LL TALK WITH YOU A LITTLE BIT ABOUT WHAT TYPE OF TRANSLATIONAL RESEARCH WE DO WITH THE HUMAN BIOSPECIMENS IN OUR OWN LABORATORY. SO TO START, WE ALL KNOW FROM WAY BACK WHEN, IF YOU KNOW ANYTHING ABOUT MITOCHONDRIA, THAT THEY ARE -- ORGANELLES. THEY AROSE FROM ANCESTORS ABOUT TWO BILLION YEARS AGO, WITH MULTICELLULAR LIFE, TO EVOLVE -- AND REGULATE MANY CELLULAR FUNCTIONS. WE KNOW THAT MITOCHONDRIA HELP TO MAKE ENERGY, SO MITOCHONDRIAL DISEASE IS AN ENERGY-DEFICIENT STATE. FROM A PATHOPHYSIOLOGIC PERSPECTIVE MITOCHONDRIA HAVE A ROLE IN MANY PROCESSES INCLUDING -- HOMEOSTASIS, APOPTOSIS, THE GENERATION AND SCAVENGING OF REACTIVE OXYGEN SPECIES, AND PRODUCING STEROIDS, AND THEY ARE REALLY THE LEADER OF THE ORCHESTRA, IF YOU WILL, OF METABOLISM. THERE ARE SO MANY DIFFERENT ASPECTS OF WHAT MITOCHONDRIAL FUNCTION MEANS, WHICH IS WHY THE DISEASES THEMSELVES ARE SO HETEROGENEOUS. IT'S BEEN SAID THAT MITOCHONDRIAL DISEASE COULD BE ANY SYMPTOM IN ANY ORGAN AT ANY AGE, WHICH IS PRETTY MUCH THE SUMMARY OF WHAT I'M GOING TO BE TELLING YOU. I THINK NEWER TECHNOLOGIES ARE HELPING US FIGURE IT OUT. UNFORTUNATELY THERE HAS BEEN NO COMMON BIOMARKER. MITOCHONDRIAL DISEASE, SPEAKING FROM A GENETIC PERSPECTIVE, IS KNOWN TO COMPRISE HUNDREDS OF DIFFERENT MEDICAL DISORDERS, SO YOU CAN UNDERSTAND WHY ONE BIOMARKER MAY NOT BE ABLE TO SPEAK FOR THE WHOLE GROUP, ALTHOUGH IT MAY SPEAK FOR SOME OF THE SUBGROUPS. IT'S RECOGNIZED THAT BOTH GENOMES ARE IMPLICATED -- MORE THAN 85 GENES, DEPENDING ON HOW YOU CLASSIFY AND DEFINE IT. MITOCHONDRIAL DISEASE CAN BE ANYTHING. IT CERTAINLY CAN HAVE RED-FLAG FINDINGS AND IT OFTEN INVOLVES NEUROLOGIC FEATURES.
SO BASAL GANGLIA DISEASE, CERTAIN PATTERNS OF SEIZURES AND STROKES, HEART MUSCLE AND HEART RHYTHM ABNORMALITIES, DIFFERENT FORMS OF VISUAL IMPAIRMENT INVOLVING THE EYE MUSCLES, THE OPTIC NERVE, THE RETINA, REALLY EVEN THE EYELIDS, GI MOTILITY, AND VERY CLASSIC EXERCISE PATTERNS OR ONSET OF PHENOTYPES. WHEN YOU SEE THESE YOU HAVE A GOOD FEELING IT'S MITOCHONDRIAL DISEASE, BUT THERE'S ALSO A WHOLE RANGE OF NON-SPECIFIC FINDINGS THAT ARE HIGHLY PREVALENT IN MITOCHONDRIAL DISEASE. THIS IS JUST AN EXAMPLE FOR THE NEUROLOGISTS IN THE ROOM OF WHAT THE CLASSIC SYNDROME LOOKS LIKE, WITH HYPERINTENSITIES IN THE BASAL GANGLIA, AND THERE ARE ACTUAL BIOCHEMICAL STUDIES THAT YOU CAN DO BY BRAIN IMAGING, WHICH IS MAGNETIC RESONANCE SPECTROSCOPY. WHEN YOU SEE THIS YOU HAVE A HIGH SUSPICION THAT THE MITOCHONDRIA ARE NOT WORKING. SO WHEN THE FIELD SAYS PRIMARY MITOCHONDRIAL DISEASE, WHAT IS REALLY BEING DISCUSSED IS THE FUNCTION OF THE RESPIRATORY CHAIN AT THE VERY END OF THE PROCESS OF MAKING ENERGY. AND SO WHAT'S HAPPENING IS THE FOOD YOU EAT, AND IT'S AFTER LUNCH SO HOPEFULLY EVERYBODY HAD LUNCH AND RIGHT NOW YOU'RE METABOLIZING IT -- THAT'S IN THE FORM OF -- THIS IS A BREAKDOWN OF YOUR FAT, YOUR SUGAR, YOUR PROTEINS. AND THAT'S CREATING ELECTRONS THAT ENTER AT COMPLEX ONE OR TWO, PASS TO COENZYME Q, WHICH LIVES IN THE MITOCHONDRIAL MEMBRANE, -- CYTOCHROME C, WHICH IS THE BIG MACHINE OF APOPTOSIS, TO COMPLEX FOUR, AND ULTIMATELY OXYGEN IS THE FINAL ELECTRON ACCEPTOR. THE PROCESS IS GENERATING A PROTON GRADIENT THROUGH THESE COMPLEXES, WHICH THEN CREATES ENERGY WITHIN THE MITOCHONDRIA IN THE FORM OF ATP. SO IT'S REALLY THIS PROCESS WHERE EACH OF THESE ARE MULTISUBUNIT COMPLEXES. THIS ONE ALONE HAS 45 SUBUNITS. THIS HAS FOUR, ETCETERA. AND SO A DEFECT IN ANY OF THE GENES THAT ARE INVOLVED IN ENCODING THE UNITS OF THIS PROCESS, THE ASSEMBLY OF IT, OR ALLOWING THIS TO FUNCTION PROPERLY IS WHAT CONSTITUTES PRIMARY MITOCHONDRIAL DISEASE.
AND SO AUSTRALIA HAS THE DISTINCT PRIVILEGE OF BEING ONE CONTINENT WITH ONE MITOCHONDRIAL CLINIC FOR THAT CONTINENT. THEY HAVE GROUP LOCATIONS AND THESE DATA ARE GRACIOUSLY SHARED BY DOCTOR -- WHO RUNS THE MITOCHONDRIAL CENTER. THEY HAVE OVER 450 KIDS WITH RESPIRATORY CHAIN DYSFUNCTION. 20 PERCENT OF THEM HAVE MUTATIONS IN SEVEN MITOCHONDRIAL GENES, ANOTHER QUARTER HAVE MUTATIONS IN 23 NUCLEAR GENES, AND FOR THE OTHER HALF, AT THE TIME IN 2010, REALLY BEFORE SEQUENCING, THERE WAS NO IDEA WHY THERE WAS RESPIRATORY CHAIN DYSFUNCTION. IT'S A BIOCHEMICAL CLUE WHEN YOU'RE STUDYING HOW ENERGY IS BEING MADE IN THE TISSUE OF THE PERSON, BUT IT'S NOT GIVING YOU THE CAUSE. THE MOST COMMON CAUSE BECAUSE IT'S THE LARGEST SUBUNIT OF THE MITOCHONDRIAL UNIT, BUT REALLY ANY TISSUE CAN BE INVOLVED. THE OTHER POINT TO TAKE AWAY ABOUT MITOCHONDRIAL DISEASE IS THAT IT'S HIGHLY LOCUS HETEROGENEOUS. BOTH GENOMES CAN CAUSE MITOCHONDRIAL DISEASE. WORKING FOR THE -- WE TABULATED IN 2008 THAT MUTATIONS WERE REPORTED IN 34 OF THE 37 MITOCHONDRIAL DNA GENES AND ABOUT 60 OF THE NUCLEAR GENES. BY 2010 THIS NUMBER HAD INCREASED TO ALL THE MITOCHONDRIAL DNA GENES AND ABOUT 80 OF THE NUCLEAR GENES, AND THIS IS HOW DOCTOR -- SUBGROUPED THEM BY WHAT THE GENES ARE INVOLVED WITH: THE RESPIRATORY CHAIN SUBUNITS, MAKING THE RESPIRATORY CHAIN, GETTING THE MITOCHONDRIAL DNA MAINTAINED AND REPLICATED, BRINGING THE NUCLEOTIDES IN, AND THE DYNAMICS, ALLOWING EVERYTHING TO WORK CORRECTLY. THOSE ARE SOME OF THE MAJOR SUSPECTS IN MITOCHONDRIAL DISEASE. EVERY -- MOST OF THE CASES OF RESPIRATORY CHAIN DEFICIENCY ARE AUTOSOMAL RECESSIVE WHEN THE CAUSE IS IDENTIFIED, BUT IT COULD BE DOMINANT -- IF YOU JUST LOOK AT INFANTILE ONSET SYNDROME MOST ARE RECESSIVE, BUT IT COULD BE -- MITOCHONDRIAL DNA. IT IS A GREAT MICROCOSM OF HUMAN MENDELIAN DISEASE. WHAT THIS MEANS FOR THE PATIENTS WHEN YOU FIGURE OUT WHAT THEY HAVE: IT'S EVERYTHING.
IF YOU DON'T KNOW THE CAUSE, ONE, YOU CAN'T POSSIBLY KNOW HOW TO TREAT THEM, BUT YOU ALSO CAN'T TELL THEM IF IT'S GOING TO HAPPEN AGAIN IN THEIR FAMILY, THE RECURRENCE RISK FOR THE PARENTS OR FOR THE CHILDREN THEMSELVES. SO REALLY IT BECOMES IMPERATIVE TO START WITH: WHAT IS THE CAUSE OF WHAT YOU HAVE? AND IF YOU SEE MITOCHONDRIAL DYSFUNCTION, IS IT PRIMARY OR SECONDARY? THERE ARE A LOT OF DRUGS, FOR EXAMPLE, THAT CAUSE SECONDARY DYSFUNCTION OF THE RESPIRATORY CHAIN. THAT DOESN'T MEAN YOU HAVE A PRIMARY GENETIC DISORDER. IT VERY MUCH MATTERS WHETHER IT WILL HAPPEN AGAIN. IF IT'S MITOCHONDRIAL DNA IT'S PASSED THROUGH THE MOTHER -- WHEREAS WITH A DOMINANT DISEASE, EITHER CHILD COULD PASS IT ON. AND SO IN ADDITION TO THERE BEING MANY DIFFERENT GENETIC CAUSES OF MITOCHONDRIAL DISEASE, ANY ONE GENETIC CAUSE CAN BE VERY PHENOTYPICALLY HETEROGENEOUS. SO -- IS PROBABLY THE MOST FAMOUS OF THE NUCLEAR GENES -- IT ALLOWS MITOCHONDRIAL DNA TO REPLICATE AND HAS A PROOFREADING FUNCTION. WHEN THERE ARE MUTATIONS IN THIS ONE GENE, IT CAN CAUSE ANYTHING FROM INFANTILE ONSET DISEASE -- WHICH IS BASAL GANGLIA -- TO, REALLY IN OLD AGE, ISOLATED DOMINANT EYE MUSCLE DISEASE. IT'S NOT THAT YOU CAN SEE THE PHENOTYPE AND SAY IT DEFINITELY IS -- JUST TESTING FOR ONE GENE HASN'T BEEN HUGELY SUCCESSFUL. SO THIS DATA WAS SHARED WITH ME BY THE DIRECTOR OF THE LABORATORY AT BAYLOR, SHOWING WHAT THEY'VE DONE IN TERMS OF TESTING FOR POLG MUTATIONS. DR. -- PUBLISHED A REVIEW SAYING MAYBE 8% OF PRIMARY MITOCHONDRIAL DISEASES ARE DUE -- MORE THAN 4,000 UNRELATED SAMPLES OVER THAT SEVEN YEAR PERIOD. THROUGH THAT TIME THEY HAD 137, OR 3%, CONFIRMED MOLECULAR DIAGNOSES. SOME HAD NO SECOND MUTATION IN WHAT WOULD BE A KNOWN RECESSIVE PHENOTYPE, SOME HAD VARIANTS OF UNKNOWN SIGNIFICANCE, AND THE TRUTH IS MOST PEOPLE HAD NO DIAGNOSIS AT ALL. SO BECAUSE THE PHENOTYPE IS SO VARIABLE, IT'S VERY HARD TO PICK ONE GENE AND BE CORRECT.
AND THIS IS REALLY THE EXPERIENCE OF, I WOULD SAY, THE NATIONAL IF NOT INTERNATIONAL -- FOR MITOCHONDRIAL DISEASE IN TRYING TO DIAGNOSE THESE PEOPLE. I WANT TO SWITCH GEARS TO WHAT WE SEE IN THE CLINIC, WHY PEOPLE PRESENT, AND HOW WE GO ABOUT DIAGNOSING THEM BEFORE AND AFTER NEXT GENERATION SEQUENCING. WE REVIEWED PEOPLE WHO CAME TO OUR CLINIC BETWEEN 2008 AND 2011 AND RANGED IN AGE BETWEEN SIX WEEKS AND 81 YEARS OLD. SO THESE ARE NOT MUTUALLY EXCLUSIVE. MOST PEOPLE HAVE MORE THAN ONE PROBLEM, BUT THE LEADING REASON PEOPLE COME IS WHY THEY COME TO ANY CLINICAL GENETICIST: INTELLECTUAL DISABILITY. SEIZURES ARE ALSO A MAJOR INDICATION. MUSCLE TONE. A FAMILY HISTORY OF MITOCHONDRIAL DISEASE IS PRETTY COMMON. WHEN YOU DO ESTABLISH A DIAGNOSIS YOU HAVE A WHOLE FAMILY TO FOLLOW. YOU HAVE MULTIPLE AFFECTED INDIVIDUALS, NOT JUST ONE, AND THEY COULD HAVE VARIABLE PHENOTYPES. MUSCLE PROBLEMS, AUTISM, LACTIC ACIDOSIS, BUT REALLY ANYTHING: VISION LOSS, HEARING LOSS, GI MOTILITY. THESE ARE JUST THE LEADING INDICATIONS. AND SO HOW DID WE DO? IN TERMS OF CLINICAL GENETICS WE DID OKAY. WE DO A LOT OF TESTING TO TRY AND FIGURE OUT THE CAUSE, AND IN 14% OF CASES WE DEFINITIVELY IDENTIFIED THE ETIOLOGY OF THEIR DISEASE. IN ANOTHER 4% AT THE TIME THERE WAS A DEFINITE BIOCHEMICAL ABNORMALITY, PERHAPS THEY HAD 2% OF NORMAL ACTIVITY, BUT WE DIDN'T KNOW WHY. ANOTHER THIRD OF PATIENTS HAD PROBABLE OR POSSIBLE MITOCHONDRIAL DISEASE. THIS IS BASED UPON OUR CLINICAL SUSPICION OR MAYBE HAVING A VARIANT THAT WASN'T PROVEN TO CAUSE DISEASE. AT LEAST ANOTHER THIRD HAD CONDITIONS THAT WE THOUGHT WERE REALLY NOT PRIMARY MITOCHONDRIAL; THEY REALLY HAD NO EVIDENCE DESPITE SOMEBODY ELSE'S CLINICAL SUSPICION. AND ABOUT 10% OF THE TIME THEY HAD ANOTHER GENETIC DISORDER ALTOGETHER. AND SO THIS IS AN EXAMPLE OF WHAT WE'RE ABLE TO FIGURE OUT BY WHOLE MITOCHONDRIAL GENOME SEQUENCING.
WHEN THE CLINIC STARTED PEOPLE WERE ONLY ABLE TO TEST FOR 12 COMMON MUTATIONS IN THE MITOCHONDRIAL DNA, WHICH IS ABOUT 16 KB, AND NOW WE KNOW THERE ARE MORE THAN 400 DIFFERENT MUTATIONS IN THE MITOCHONDRIAL DNA THAT CAN CAUSE DISEASE. THE ONES IN RED ARE THE ONES THAT WOULD HAVE BEEN PICKED UP ON THE COMMON MUTATION PANEL AND THE ONES IN BLACK ARE NOT. AND YOU CAN SEE THAT THEY ARE IN THE TRNA GENES AND A LOT OF THE COMPLEX ONE SUBUNITS. THERE WAS MORE THAN ONE PERSON, OFTEN IN THE SAME FAMILY BUT NOT ALWAYS, WITH EACH MUTATION. AND THEN THERE WERE MULTIPLE VARIANTS THAT APPEARED TO BE PATHOGENIC BASED UPON FREQUENCY, CONSERVATION, AND WHAT'S KNOWN. THE WAY TO PROVE THEY ARE PATHOGENIC IS TO CREATE A CELL LINE, TAKE THESE MITOCHONDRIA AND PUT THEM IN THAT BACKGROUND. THAT'S NOT AVAILABLE CLINICALLY AND YOU HAVE TO GET SOMEBODY INTERESTED WHO HAS THE TIME AND DESIRE TO DO THAT. THAT'S WHY A LOT OF THESE VARIANTS STAY UNPROVEN FOR QUITE SOME TIME. WE MADE MANY OTHER DIAGNOSES. ONE WAS A POLG CHILD, ONE WAS A DELETION OF THE MITOCHONDRIAL DNA. YOU COULD HAVE A CHUNK OF THE MITOCHONDRIAL DNA MISSING RATHER THAN HAVING A POINT MUTATION. THERE WERE OTHER METABOLIC DISEASES RELATED TO CARNITINE, OR OTHER DISEASES. SNP MICROARRAY ANALYSIS HAS BEEN VERY HELPFUL TO FIND LARGE ABNORMALITIES. SO THESE FIVE INDIVIDUALS -- NON-DEVELOPMENT EPILEPSY SYNDROME BY A DELETION ON AN ARRAY, DOWN TO -- INTELLECTUAL DISABILITY OR VERY COMPLEX CYTOGENETIC COPY NUMBER ABNORMALITIES. BECAUSE THE PHENOTYPES ARE SO HETEROGENEOUS, PEOPLE DON'T KNOW WHAT IT IS. IT COULD BE MITOCHONDRIAL DISEASE, BUT YOU REALLY FALL INTO A CATEGORY WHERE IT COULD BE LOTS OF OTHER GENETIC CAUSES. SO WHAT I SHARED WITH YOU AND HOPE TO CONVINCE YOU OF IS THAT THE GENE-BY-GENE DIAGNOSTIC ROUTE HAS LIMITED SUCCESS. I THINK IF YOU DO IT IN THE CONTEXT OF A DEDICATED CLINIC, IT IS HELPFUL. I MEAN YOU ARE ABLE TO RECOGNIZE CERTAIN CLASSIC PHENOTYPES.
AND IN THE PAST IT'S MORE MITOCHONDRIAL DNA -- REALLY COMPLEX METABOLIC LABS, USING THE ORGANIC ACID PROFILES TO RECOGNIZE INBORN ERRORS OF METABOLISM THAT OVERLAP WITH MITOCHONDRIAL DISEASE, AS WELL AS TO HELP PERFORM, INTERPRET, AND GUIDE GENETIC TESTING AND TISSUE STUDIES, AND WE CAN IDENTIFY OVERLAPPING CONDITIONS. I THINK YOU'LL AGREE THIS IS TIME INTENSIVE, VERY LABOR INTENSIVE, AND VERY TESTING INTENSIVE. IN THE PAST, ONE GENE CAN COST $2,000 TO $3,000 TO TEST. WHAT I'LL CONVINCE YOU OF IS THAT YOU CAN DO THE WHOLE EXOME FOR $5,000 TO $6,000: 20 THOUSAND GENES FOR TWICE THE PRICE OF ONE GENE. SO WE'RE VERY EXCITED ABOUT SEQUENCING IN THIS PARTICULAR HETEROGENEOUS CLASS OF DISEASE. LIKE I MENTIONED, THERE ARE MANY GENETIC CAUSES. MOST DISEASE GENES ARE NOT COMMON, SO OF ALL OF THESE 85, ONLY FIVE HAD MORE THAN 55 PATHOGENIC VARIANTS RECORDED. FOR POLG, DR. WILLIAM COPELAND KEEPS THE DATABASE HERE AT NIH AND HE LISTS ALL OF THE DIFFERENT MUTATIONS ON HIS WEBSITE, ALL THE VARIANTS THAT HAVE BEEN REPORTED. THERE ARE MANY. THERE'S ONE OARKS KAY ONE HAS MULTIPLE. THE -- IS INVOLVED MORE IN CANCER THAN PRIMARY RESPIRATORY CHAIN PHENOTYPES. MOST DISEASES ARE PRIVATE, SO THEY'RE IN A FEW FAMILIES. AND MOST GENES HAVEN'T EVEN BEEN IDENTIFIED YET. SO THERE ARE MORE THAN 1100 DIFFERENT PROTEINS IN THE MITOCHONDRIA. THEY ARE GOOD -- ALL OF THE KNOWN DISEASE GENES MAKE PRODUCTS THAT ARE LOCALIZED IN THE MITOCHONDRIA. SO IT'S VERY LIKELY THAT MORE OF THESE PROTEINS ARE GOING TO BE MITOCHON-- IF YOU PERFORM NEXT -- SEQUENCING, HOPEFULLY OF BOTH GENOMES AT ONCE. WHAT DO YOU DO IF YOU CAME TO MY CLINIC TOMORROW? IF YOU COME, WE COULD TOTALLY SEND THE KNOWN MITOCHONDRIAL DISEASE GENES. SO THIS COSTS, LIST PRICE, ABOUT $7,000 AND WE COULD TEST ABOUT 100 GENES. BY THE PUBLICATION OF THE -- ABOUT A QUARTER OF CASES THAT HADN'T BEEN SOLVED BEFORE WERE SOLVED JUST BY LOOKING AT THE KNOWN GENES. IN THE PAST YOU COULDN'T TEST FOR THE KNOWN GENES. YOU COULD TEST FOR POLG AND SOME OF THE COMMON MUTATIONS.
IF WE WERE TO SEQUENCE ALL OF THE MITOCHONDRIAL LOCALIZED GENES, WHICH REALLY ISN'T SOMETHING THAT'S CLINICALLY AVAILABLE YET, BUT IF WE COULD, THEN AGAIN PROBABLY A 25% DIAGNOSTIC RATE FOR THE UNSOLVED CASES AND MAYBE 50% OF ALL CASES, THE EASY ONES AND THE HARD. MUCH BETTER THAN A GENE-BY-GENE APPROACH. BUT IF YOU WERE TO SEQUENCE THE WHOLE EXOME, WHAT COULD YOU DO? IT'S NOT JUST THE MOST LIKELY CANDIDATE BUT ALL THE CANDIDATES. SO BAYLOR RELEASED SOME OF THE EARLY DATA ON THE FIRST 34 CASES THAT THEY HAD SEQUENCED SINCE LAST YEAR. THEY IDENTIFIED A CLEAR GENETIC ETIOLOGY IN 38% OF THEM. OF COURSE THIS APPROACH IS ALSO COMPLEX. YOU HAVE A LOT OF DATA, AND THE WAY THEY WERE STARTING, AT LEAST, THEY WERE ONLY EVALUATING PROBANDS, NOT THEIR FAMILY MEMBERS, AND NOT STUDYING THE MITOCHONDRIAL DNA, WHERE WE KNOW A SIZEABLE PERCENT OF THE MUTATIONS ARE. THERE WAS DEFINITELY ROOM FOR IMPROVEMENT, AND THERE ARE DEFINITELY BENEFITS AND LIMITATIONS TO EACH OF THESE APPROACHES AND AREAS FOR RESEARCH AS WELL. FOR THE MOMENT THIS SUMMARIZES THIS PART. IF YOU SUSPECT MITOCHONDRIAL DISEASE YOU DO A GOOD CLINICAL HISTORY AND EXAMINATION, AND A PEDIGREE OF COURSE. BEFORE YOU CAN INTERPRET NEXT-GEN SEQUENCING YOU NEED TO KNOW THE MODE OF INHERITANCE THAT IS MOST LIKELY, THE METABOLIC SCREENING LABS, AND THE TISSUE LABS. OF COURSE I URGE YOU TO LOOK AT THE ENTIRE MITOCHONDRIAL DNA, NOT JUST FOR POINT MUTATIONS BUT ALSO FOR DELETIONS AND HOW MUCH MITOCHONDRIAL DNA YOU MIGHT HAVE. AND THEN COPY NUMBER ALTERATIONS, REALLY A VERY STANDARD ASSAY IN CLINICAL GENETICS THAT HAS REPLACED THE KARYOTYPE IN MANY INSTANCES. AND NEXT-GEN SEQUENCING. WHY ARE WE HERE TODAY? NOW YOU HAVE THIS INCREDIBLY HETEROGENEOUS GROUP. WHAT SHOULD YOU BE COLLECTING FROM THEM TO STUDY AND HOW SHOULD YOU BE DOING IT? I THOUGHT I WOULD START BY TALKING ABOUT THE LOCAL PI WHO IS NOT JUST SEEING PATIENTS IN CLINIC BUT WANTS TO MAKE THE TRANSITION AND STUDY THEIR SAMPLES. WHAT ARE THE CHALLENGES AND WHAT ARE THE BENEFITS? IT'S VERY TIME INTENSIVE. NO DETAIL IS TOO SMALL.
OF COURSE IT TAKES A LOT OF ENERGY. SO IN OUR INSTANCE WE HAVE A GENETIC COUNSELOR WHO SEES THE PATIENTS IN CLINIC WITH ME. IT'S ABOUT TWO TO THREE HOURS OF HER TIME TO PREPARE FOR EACH PATIENT WE SEE. AND THEN WE SPEND ABOUT TWO HOURS IN CLINIC WITH EACH PATIENT, AND IT'S AT LEAST AN HOUR OF ARRANGING TESTING AFTERWARDS, LET ALONE FOLLOW UP. THAT'S BEFORE YOU'VE ENTERED THEIR INFORMATION INTO ANY STUDY DATABASE. THEN THERE'S THE CLINICAL LETTER, WHICH IS OFTEN BETWEEN FIVE AND SIX PAGES LONG, TO EXPLAIN TO THEM WHAT THEY MIGHT HAVE. THAT'S EVEN IF YOU KNOW IT. AND THEN YOU NEED TO GET IRB APPROVAL, OF COURSE, TO ALLOW YOU TO COLLECT THEIR TISSUES AND DECIDE WHAT YOU'RE GOING TO DO WITH THEM, AND OF COURSE THAT MEANS ONGOING IRB APPROVAL, OVERSIGHT AND AUDITS, AND KEEPING UP AND MAINTAINING ALL THE PROPER FORMS. YOU NEED TO HAVE A DATABASE AND MAINTAIN IT AND BE ABLE TO MINE IT. THIS I THINK IS NOT TRIVIAL IN ANY WAY. I THINK SETTING THIS UP RIGHT FROM THE START WILL MAKE YOUR LIFE INFINITELY BETTER. WE HAVE ALWAYS WORKED IN EXCEL, SO I TRUST EXCEL, IT'S VERY CONCRETE, BUT IT HAS CERTAIN LIMITATIONS IN BEING ABLE TO MINE IT AND SHARE IT AND UPDATE IT OVER TIME. WE'RE SWITCHING OVER TO REDCAP, WHICH MANY PEOPLE ARE FAMILIAR WITH. THERE ARE CERTAINLY OTHER DATABASES AVAILABLE AND WE SHOULD PROBABLY BE INTEGRATING WITH -- REPLICATING ALL THE SAME ENERGY AND PERHAPS DOING THE SAME STUDIES. IF THEY'RE ENROLLING WITH MY COLLEAGUES IN BOSTON OR CALIFORNIA, THEY'RE NOT REPEATING THE SAME THING AS WELL. OF COURSE THERE IS LOCAL ACQUISITION OF TISSUE SAMPLES, THE CELL LINES, THE DERIVED MATERIALS, WHICH IS NOT -- WE HAVE THE EXACT PROTOCOL ABOUT HOW ALL THESE SPECIMENS WILL BE COLLECTED: IS IT CLINICAL, IS IT RESEARCH? AND WHO IN THE LABORATORY IS GOING TO DO THESE THINGS? THAT'S TECHNICIAN TRAINING AND TIME. AND THEN DOING IT CORRECTLY: WHAT IS THE CORRECT WAY TO GET DNA OUT OF THESE DIFFERENT SAMPLES, OR RNA OR PROTEINS, AND HOW SHOULD THE SAMPLES BE MAINTAINED?
FOR EXAMPLE, IF YOU HAVE A MITOCHONDRIAL GENE MUTATION AND YOU DON'T PREPARE THE CELL LINE CORRECTLY, YOU COULD ABSOLUTELY LOSE IT AND ONLY MAINTAIN THE NORMAL CELLS WITH THE NORMAL MITOCHONDRIAL DNA, AND YOU'VE LOST YOUR MUTATION. THAT'S VERY COMMON, AND IT MIGHT BE A REASON YOU MAY NOT WANT TO GET THAT CELL LINE FROM SOMEWHERE THEY WEREN'T AWARE OF THOSE ISSUES. BUT OF COURSE THERE'S A HIGH POTENTIAL PAYOFF. IF YOU KNOW YOUR PATIENTS REALLY WELL YOU KNOW BEYOND WHAT'S IN THE CLINICAL RECORD, RIGHT. THE CLINICAL RECORD OFTEN HAS WHAT'S KNOWN, AND IF IT'S AN UNKNOWN DIAGNOSIS, RIGHT, YOU MIGHT KNOW WHAT ELSE YOU THINK MIGHT BE GOING ON. AND YOU KNOW ALL THE DIFFERENT TESTING THAT WAS DONE. AND YOU ALSO HAVE RICH MATERIAL, AS HOPEFULLY I'LL SHOW YOU IN A FEW MOMENTS, ABOUT WHAT YOU COULD TEST FOR. SO ADDITIONAL EVALUATIONS FOR ETIOLOGY. IN THE BEST OF HANDS, EXOME SEQUENCING RIGHT NOW IS IDENTIFYING 50 TO 60% OF CASES. WHAT DO THE OTHER PEOPLE HAVE? THE PATHOPHYSIOLOGY, OKAY. SO NOW YOU HAVE YOUR GENES, WHAT ARE YOU GOING TO DO ABOUT IT? THE FIRST THING YOU NEED TO KNOW, LET'S SAY IT'S AN UNKNOWN GENE, IS WHAT IS IT DOING. WHAT IS IT DISRUPTING AND WHAT THERAPIES MIGHT HELP IT? BEFORE YOU MOVE TO A CLINICAL TRIAL IT WOULD BE HELPFUL TO KNOW AND BE ABLE TO CORRECT SOME PHENOTYPE IN VITRO. SO THIS IS A SUMMARY OF WHAT WE'VE PUT TOGETHER SINCE 2008. WE STARTED OUR STUDY BECAUSE WE FOUND SOME DATA IN THE WORMS ABOUT SOME METABOLIC PROFILES IN RESPIRATORY CHAIN DISEASE, AND WE WANTED TO SEE WHETHER THEY WERE COMMON IN HUMANS. AT THAT POINT I WAS A NEW INVESTIGATOR; I DIDN'T HAVE A BIOSPECIMEN LIBRARY TO TURN TO, AND SO IT TOOK ABOUT TWO OR THREE YEARS BEFORE I HAD ENOUGH SAMPLES TO BE ABLE TO DO THE INTENDED STUDY. THAT WAS FUNDED BY THE NIH, BUT IT TOOK TIME TO SET UP PROPERLY. AND SO RIGHT NOW, AS OF I THINK THIS MORNING, WE HAVE A LITTLE OVER 220 DIFFERENT SUBJECTS ENROLLED IN OUR STUDY.
ABOUT 180 OF THOSE ARE PATIENTS AND/OR THEIR FAMILY MEMBERS, BECAUSE NOW TO UNDERSTAND A PATIENT'S GENETICS YOU NEED MORE THAN THEIR PEDIGREE, YOU NEED THEIR FAMILY'S GENETICS. SO YOU REALLY NEED TO MAKE THE WHOLE FAMILY PART OF YOUR STUDY. AND THEN WE ALSO HAVE ABOUT 40 CONTROL TISSUES, BECAUSE WHENEVER YOU DO ANYTHING IN VITRO YOU HAVE TO GET THE CONTROL AS WELL, AND YOU SEEK AGE MATCHED AND GENDER MATCHED. OF THIS POPULATION I THINK 45 PEOPLE HAVE DEFINITE MITOCHONDRIAL DISEASE, ABOUT 55 PEOPLE HAVE PROBABLE, AND THEN THERE ARE OTHERS WITH OTHER DISEASES IN THERE, AND WE HAVE ABOUT 50 FAMILY MEMBERS. SO I COUNTED JUST THE OTHER DAY WHAT'S IN MY LABORATORY. IF YOU WERE TO ASK ME, WE HAVE DEFINITE MITOCHONDRIAL DNA MUTATIONS, DEFINITE NUCLEAR MUTATIONS, POLG AND PB17, FROM REAL CHILDREN THAT WE'VE SEEN, SOME OF WHOM ARE NOW DECEASED, AND WHO HAVE CONSENTED FOR THESE SAMPLES TO BE USED IN OTHER STUDIES. BUT HOW WOULD YOU KNOW I HAD THEM UNLESS YOU KNEW TO ASK? AND WHAT DO WE HAVE? WE HAVE CELL LINES, FIBROBLAST CELL LINES, AND MUSCLE -- WE WOUND UP GETTING 30 DIFFERENT MUSCLE BIOPSY SAMPLES. THAT'S WHAT TOOK SO LONG, BECAUSE MOST BABIES DON'T WANT TO GIVE YOU MUSCLE. IT'S HARD TO GET AND IT'S PRECIOUS WHEN YOU DO GET IT, AND HOW IS IT HANDLED AND HOW IS IT FROZEN? THERE ARE A LOT OF DETAILS. THIS IS WHAT'S LEFT. OTHER TYPES OF DISEASES, SO OTHER DISEASES DOWN HERE, PEOPLE WITH PROBABLE DISEASE WHERE IT HASN'T BEEN SOLVED YET. AND OF COURSE CONTROLS. AND THEN WHAT IS EXTRACTED? WE'VE KEPT EVERYTHING, OF COURSE. WE HAVE BLOOD DNA, BLOOD RNA -- JUST EXTRACTING IT. AND SO REALLY WHAT WE'RE DOING IS LEANING TOWARDS REDCAP SO WE CAN DO THIS IN A MEANINGFUL WAY. AGAIN THERE'S DEFINITELY ROOM FOR IMPROVEMENT. I SPOKE WITH OUR BIO-- COMMUNITY IN PREPARATION FOR THIS TALK ABOUT WHAT THE INSTITUTIONAL EFFORTS ARE. THIS IS ONE INVESTIGATOR'S EXPERIENCE FOR ONE SUBSET OF RARE DISEASES, ALBEIT HETEROGENEOUS. SO THEY'VE INSTALLED AND ARE WORKING ON A SAMPLE DATA TRACKING SYSTEM.
I THINK IT'S SIMILAR TO SOME OF THE THINGS AVAILABLE THROUGH THE BIOSPECIMEN -- AT NCI AND THE RARE DISEASE GROUP. SO THEY'RE USING THE NAUTILUS LIMS SYSTEM, I BELIEVE BY -- SCIENCE. TO MANAGE NORMAL PI DATA THEY'RE -- YOU CAN GO ONLINE, THERE ARE SOMETHING LIKE 50,000 USERS OF REDCAP, AND IT SPREADS LIKE A VIRUS. ONCE PEOPLE SEE IT THEY THINK THIS MAKES SENSE: IT'S EASY TO USE, YOU CAN TAKE DATA OUT AND PUT DATA IN. FOR US THIS IS REALLY WHERE WE'RE TRYING TO PUT ALL OF OUR DATA. THERE ARE OUR CLINICAL DATA SYSTEMS. CHOP IS NOW TOTALLY INDEPENDENT, SO ALL OUR CLINICAL LETTERS AND CLINICAL VALUES ARE IN -- IF YOU ARE LUCKY ENOUGH IN RARE DISEASES TO MOVE TO A CLINICAL TRIAL, THERE ARE SOME AT CHOP -- ONE SUBGROUP OF MITOCHONDRIAL DISEASES, AND FOR OTHER TYPES OF CLINICAL TRIALS THEY'RE USING THE ONCORE SYSTEM. AND ULTIMATELY I THINK THERE'S A LOT OF INTEREST IN HARVEST FOR DATA QUERY AND DATA EXPLORATION. SO I'M AWARE OF -- DATA MINING AND ANOTHER ONE FOR NEXT GENERATION SEQUENCING DATA MINING, REALLY A VISUAL FRONT END FOR RESEARCHERS WHO CAN'T TYPE CODE. BUT REALLY, WHEN YOU WANT TO START YOUR OWN BIOSPECIMEN REPOSITORY, I DON'T THINK ANY OF THIS IS NECESSARILY STRAIGHTFORWARD, AND I DON'T NECESSARILY THINK IT NEEDS TO BE RECREATED FROM SCRATCH. I THINK MAYBE WE COULD ALL GET TOGETHER AND ADAPT TO SIMILAR SYSTEMS. BUT THEN LET'S SAY WE WANT TO SHARE IT. THESE ARE THE CENTRAL ISSUES, AS I SEE THEM, IN MAKING THESE BIOREPOSITORIES. ONE OF THE BIG ISSUES IS TRUST. IT HAS TO BE THE DATA QUALITY. SO IS IT ACCURATE DATA, IS IT UPDATED DATA? IF I PUT THE DATA IN THREE YEARS AGO, YOU KNOW, I DIDN'T KNOW THE GENE, RIGHT, BECAUSE WE COULDN'T FIGURE IT OUT BACK THEN. HAVE I UPDATED IT? YOU COULD SPEND YOUR TIME STUDYING ONE THING WHERE I KNOW THE PHENOTYPE IS SOMETHING DIFFERENT. THAT'S HAPPENED TO ME, WHERE SOMEBODY SAID THIS IS A -- SUFFICIENCY FROM TEN YEARS AGO AND I GO AND LOOK AND IT'S A TOTALLY DIFFERENT GENETIC DISORDER. HOW ACCURATE IS THE DATA?
AND THERE ARE SOME OF THE ISSUES I MENTIONED TO YOU. IF YOU LET MUSCLE THAW IT'S NOT SO GOOD ANYMORE. DO I EVEN WANT IT, AND HOW DO I KNOW HOW IT'S STORED? HOW DO I KNOW ALL THE STEPS? IF IT'S BEEN IN MY OWN PATHOLOGY DEPARTMENT FROM THE MOMENT IT WAS COLLECTED UNTIL I TAKE IT, I MIGHT HAVE A LOT MORE CONFIDENCE. WHAT IF IT TRAVELS FROM MULTIPLE SITES OVER TIME TO BE SHARED? AND THEN HOW WELL IS THAT ONGOING RESOURCE GOING TO BE MAINTAINED? IS IT GOING TO BE FUNDED IN A SUSTAINABLE WAY? CAN I TRUST THE SAMPLES WILL BE THERE IN FIVE YEARS OR 20 YEARS, OR SHOULD I KEEP A SAMPLE OF EVERYTHING ANYWAY JUST IN CASE? I CAN'T EVEN GET THESE SAMPLES AGAIN, OFTEN, WHEN CHILDREN ARE DECEASED. AND AGAIN IT'S THE PHENOABILITY. AND THE INCENTIVE. I DON'T THINK THIS IS A SMALL PART OF THE STORY. SO WHY SHOULD A LOCAL PI, ESPECIALLY MAYBE SOMEBODY WHO HAS BEEN DOING THIS FOR 20 OR 30 YEARS AND HAS 200 OR 500 OR A THOUSAND SAMPLES, SHARE THEM WITH ME AS A NEW INVESTIGATOR, OR SHARE THEM WITH YOU, WHO HAS A NEW IDEA FOR A STUDY? THERE HAS TO BE SOME SORT OF INCENTIVE FOR PEOPLE WHO HAVE DONE ALL THIS, SOMETIMES ON THEIR OWN FUNDING; SOME USE THEIR OWN PERSONAL FUNDING. WHAT IS THE SUPPORT, AND WHAT IS THE LOCAL PI SUPPORT FOR THE SAMPLE ACQUISITION? SO FOR EXAMPLE, IF YOU WANT TO ESTABLISH A CELL LINE, THE COST TO THE PATIENT AT MY INSTITUTION WOULD BE $800 FOR ONE CELL LINE. ON AN INVESTIGATIVE RESEARCH BASIS IT MIGHT BE $130, RIGHT. THERE IS CERTAINLY A COST TO IT, AND HOW DOES THE LOCAL PI DO THAT, WITH FAMILY DONATIONS? IS THIS SOMETHING THAT CAN BE SHARED, IS IT NIH FUNDED, AND THEN WHO IS GOING TO DO THE DATA ENTRY? LIKE I MENTIONED, IF YOU HAVE A SIX PAGE LETTER WORTH OF INFORMATION AND WE'RE TRYING TO PUT IT INTO A COMMON FORMAT, I HAVE ALL THE DATA, BUT DO I THEN HAVE TO SIT DOWN AND FIND TIME TO PUT IT IN? PROBABLY IT'S ACCURATE IF I DID IT OR MY COUNSELOR DID IT, RIGHT. AND THERE MIGHT BE INFORMATION THAT WE KNOW THAT'S NOT IN THE LETTER, BUT WHO HAS THE TIME TO DO THAT?
THAT DOESN'T MEAN IT SHOULDN'T BE DONE, BUT I THINK IT'S A BIG ISSUE. AND THEN I THINK ONE OF THE INCENTIVES IS THAT THIS IS A WIDELY AVAILABLE SOURCE, SO IT'S NOT JUST MY SAMPLES IN THERE, IT'S MAYBE OTHER PEOPLE'S SAMPLES I WANT TO USE. IS IT AN UNEQUAL BENEFIT FOR A NEW INVESTIGATOR WHO SAYS, I WANT TO STUDY MUSCLE SAMPLES, MAY I HAVE YOUR MUSCLE SAMPLES, AS OPPOSED TO SOMEBODY WHO HAS BEEN MAKING THE RELATIONSHIPS AND COLLECTING THEM ONE BY ONE OVER DECADES? ACCESS I THINK IS A BIG CONCERN. HOW WILL THIS DATA BE STORED? HOW WILL IT BE ACCESSED? WHAT ARE THE HURDLES THAT I HAVE TO GO THROUGH? WHAT ARE THE COSTS, IF I PUT SOMETHING IN, TO GET IT BACK OUT? AND AGAIN, HOW WILL I BE SURE OF THE QUALITY AND OF THE SAMPLE ORIGINS, THAT IT'S THE CORRECT SAMPLE? AND AGAIN I THINK IT'S THE SAMPLES AND THE CLINICAL INFORMATION, BUT IF I PUT THE SAMPLE IN AND SOMEBODY WERE TO USE THE SAMPLE AND FIND SOMETHING BRILLIANT FROM IT, IS THAT MY WORK OR DO I GET NO CREDIT FOR THAT AT ALL? I THINK THAT'S ON EVERY PI'S MIND: WHAT HAPPENS TO THE SAMPLE? CLEARLY IF IT'S PUBLISHED, A KNOWN GENE IN A CELL LINE, THAT'S NOT THE ISSUE. I THINK THE ISSUE IS WHEN THIS IS A BRAND NEW OR MAYBE UNCHARACTERIZED CONDITION AND I PUT THE SAMPLE OUT THERE. ARE YOU ALLOWING PEOPLE TO FORM COLLABORATIONS OR ARE YOU REQUIRING PEOPLE TO GIVE AWAY THEIR WORK? I THINK THAT'S ON THE PI'S MIND. AND DATA TRANSPARENCY IS AN ISSUE. SO IF I PUT SAMPLES IN, HOW DO I KNOW WHAT ELSE IS IN THERE? WHO CAN STUDY IT, HOW CAN I MINE THIS DATA? AND I KNOW, AGAIN, I WAS TALKING TO DR. RUBENSTEIN, THERE ARE SOME VERY CLEAR ANSWERS, AND PROBABLY SHE COULD JUMP IN RIGHT NOW AND TELL US THESE HAVE BEEN SOLVED, AND THAT'S GOOD. BUT I THINK THESE ARE WHAT ARE ON PEOPLE'S MINDS. HOW EASY ARE THESE SYSTEMS TO MINE? IS IT AS EASY AS A GOOGLE SEARCH AND I CAN GET ALL MY INFORMATION, OR ARE THERE CERTAIN FIELDS I WILL NEVER BE AWARE OF? SO THE MITOCHONDRIAL DISEASE COMMUNITY HAS COME UP WITH AN ANSWER THAT I'LL JUST MENTION BRIEFLY HERE.
I'M CERTAINLY NOT IN CHARGE OF IT, SO I'LL JUST MENTION SOME OF THE KEY FACTS. IT WAS ESTABLISHED I BELIEVE IN 2011 AND THERE ARE NOW 14 SITES IN THE U.S. AND CANADA. I KNOW THAT BECAUSE I WAS THE 14TH SITE. THIS IS FUNDED THROUGH AN R01 AT COLUMBIA TO DRS. -- WHO HAVE DONE A TERRIFIC JOB TOGETHER WITH THE NIC AND THE NINDS TO TRY TO BEGIN TO ADDRESS THESE PROBLEMS FOR THE MITOCHONDRIAL COMMUNITY. THE MAJOR GOALS ARE TO ESTABLISH A PATIENT REGISTRY, WHICH I THINK EVERYBODY AGREES IS NECESSARY, AND I THINK IT'S ACTUALLY A LITTLE BIT LESS TREACHEROUS THAN A BIOSPECIMEN REPOSITORY. IF YOU WANT TO DO IT ON HIGH -- IF YOU NEED 60 PATIENTS, THERE ARE ONLY 60 OR 70 PATIENTS ALIVE WORLDWIDE WITH THAT CONDITION THAT HAVE BEEN IDENTIFIED. YOU CAN'T DO THIS ALONE. WHAT HAPPENS TO THE PEOPLE NORMALLY? SO I THINK THE MITOCHONDRIAL DISEASE COMMUNITY IS CLEARLY VERY MUCH IN SUPPORT OF THE IDEA THAT WE NEED TO KNOW WHAT HAPPENS TO A LARGE COHORT OF PEOPLE, IDEALLY WITH THE SAME GENETIC TYPE OR SUBGROUP OF DISEASES, SO THAT WE COULD BEGIN TO DO REAL CLINICAL TRIALS TO SEE IF THERAPIES AND INTERVENTIONS ACTUALLY WORK. I THINK THAT'S GOTTEN BIG BUY-IN. ONE OF THE PLACES YOU -- IT TAKES A LOT OF TIME TO MAKE SURE THE PHENOTYPE IS CORRECTLY SHARED. SO WHOSE TIME IS THAT, AND WHAT ARE THE FIELDS THAT YOU NEED? THAT'S STILL A MAJOR BARRIER TO ANY STUDY. AND THEN THE OTHER ISSUE IS THE ESTABLISHMENT OF THE BIOREPOSITORY. THIS MONTH I THINK THE IRB WAS APPROVED AND THERE'S NOW A BIOREPOSITORY ONGOING. MY UNDERSTANDING IS ANY TISSUE TYPE CAN BE ACCEPTED AND THEY WANT AT LEAST 500 PATIENT SPECIMENS OVER THREE YEARS. SO IN TERMS OF, IS THAT BLOOD, I THINK A LOT IS STILL BEING WORKED OUT, BUT I THINK THIS IS A MAJOR STEP IN THE RIGHT DIRECTION TO TRY AND ADDRESS SOME OF THESE THINGS. FOR ANY BIOSPECIMEN, NOW I COULD PUT MY TISSUES IN A LOCATION WHERE OTHER PEOPLE CAN REQUEST THEM, SO I'M NOT GOING TO BE PULLING THEM ALL DAY LONG.
SO IN THE FEW MINUTES I GUESS THAT ARE REMAINING, I'M JUST GOING TO GIVE YOU A FEW EXAMPLES OF HOW WE USE THESE BIOSPECIMENS ONCE WE GET THEM. WHY ARE THESE SO PRECIOUS AND WHAT CAN WE DO WITH THEM? IF YOU THINK ABOUT ETIOLOGY, I THINK THERE'S A LOT OF WORK TO DO IN TERMS OF JUST GENE IDENTIFICATION. THIS IS ONE EXAMPLE OF A FAMILY WE FOLLOWED WHERE WE PERFORMED WHOLE EXOME SEQUENCING, I BELIEVE SUCCESSFULLY, WHERE DIAGNOSIS BECOMES NOT JUST AN UNENDING PARADE OF ONE GENE AFTER THE NEXT BUT A COMPUTER GAME THAT CAN BE SOLVED IN A MATTER OF WEEKS. THIS IS A YOUNG GIRL WHO'S BEEN FOLLOWED BY MYSELF AND A COLLEAGUE SINCE EARLY INFANCY WITH TERRIBLE SYNDROME, CHRONIC -- HER LACTATES ARE ALWAYS THREE OR FOUR OR FIVE AND NORMAL IS LESS THAN TWO. SHE HAS A SEVERE MUSCLE RESPIRATORY CHAIN DEFICIENCY IN COMPLEXES ONE AND THREE. SHE CAME TO ME ORIGINALLY BECAUSE SOMEBODY FOUND OUT SHE'S A POLG HETEROZYGOTE. SHE HAD NO SECOND MUTATION; IT REALLY DIDN'T FIT. WE LOOKED FOR ABNORMALITIES; SHE HAD NONE. AS A MATTER OF FACT, MITOCHONDRIAL DNA WITH 18 DIFFERENT GENES TOOK FOUR OR FIVE YEARS TO DO, ALL BASED ON WHAT WAS CLINICALLY AVAILABLE. WE SEQUENCED THESE AND FOUND NOTHING. WE FOUND NO FAMILY HISTORY; HER PARENTS ARE PERFECTLY NORMAL. HOW DO WE SOLVE THIS? IN THE PAST, IF YOU WANTED TO DO LINKAGE YOU NEEDED MULTIPLE FAMILIES. WHAT DISEASE IS THIS? WHAT WE DID IS WE TOOK HER EXOME AND WE GOT -- IN COLLABORATION WITH DR. ERIC PIERCE -- BOTH PREVIOUSLY AT CHOP. OF THOSE ABOUT 150,000 VARIANTS, -- 20,000 WERE CODING. IF YOU ONLY LOOK AT THE NONSYNONYMOUS VARIANTS YOU GET TO 11,000. HOW DO YOU GET FROM 11,000 NONSYNONYMOUS VARIANTS TO HER DISEASE? IN OUR EXPERIENCE, WE ARE BIG BELIEVERS THAT YOU SHOULD STUDY THE WHOLE FAMILY, BOTH THE PARENTS AND THE SIBLINGS, AFFECTED OR UNAFFECTED, AS WELL. WHERE HAS SHE BI-PARENTALLY INHERITED VARIANTS? WHICH ONES WERE INVOLVED IN MAKING MITOCHONDRIAL PROTEINS? ONLY 18 VARIANTS IN EIGHT GENES. WHICH ONES ARE NOVEL BASED UPON FREQUENCIES? AN ALLELE THAT HAS 50% FREQUENCY IS A NON-DISEASE ALLELE.
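AS AN ILLUSTRATION ONLY, THE FILTERING FUNNEL DESCRIBED HERE (CODING, NONSYNONYMOUS, MITOCHONDRIAL-LOCALIZED, RARE, AND INHERITED FROM BOTH PARENTS) CAN BE SKETCHED IN PYTHON. THIS IS A MINIMAL SKETCH AND NOT THE PIPELINE USED IN THE STUDY: THE `Variant` FIELDS, THE `candidate_genes` FUNCTION, THE PLACEHOLDER GENE NAMES, AND THE 1% FREQUENCY CUTOFF ARE ALL HYPOTHETICAL, AND FOR SIMPLICITY THE PER-VARIANT FILTERS ARE APPLIED BEFORE THE BI-PARENTAL CHECK.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One variant call in the child; every field is a hypothetical annotation."""
    gene: str                # gene symbol (placeholder names below)
    coding: bool             # falls in protein-coding sequence
    nonsynonymous: bool      # changes the encoded amino acid
    from_mother: bool        # also seen in the mother's sample
    from_father: bool        # also seen in the father's sample
    mito_localized: bool     # gene product is localized to mitochondria
    population_freq: float   # allele frequency in a reference population


def candidate_genes(variants, max_freq=0.01):
    """Return {gene: [variants]} for genes that survive every filter.

    max_freq is an assumed cutoff for "rare"; the talk only says that a
    common allele (for example 50% frequency) is not a disease allele.
    """
    # Per-variant filters: coding, nonsynonymous, mitochondrial, rare.
    kept = [
        v for v in variants
        if v.coding
        and v.nonsynonymous
        and v.mito_localized
        and v.population_freq <= max_freq
    ]

    # Group the surviving variants by gene.
    by_gene = defaultdict(list)
    for v in kept:
        by_gene[v.gene].append(v)

    # Recessive model: keep genes with at least one surviving variant
    # from the mother and at least one from the father.
    return {
        gene: vs
        for gene, vs in by_gene.items()
        if any(v.from_mother for v in vs) and any(v.from_father for v in vs)
    }


# Toy data; GENE_A to GENE_D are placeholders, not real findings.
toy_variants = [
    Variant("GENE_A", True, True, True, False, True, 0.0),
    Variant("GENE_A", True, True, False, True, True, 0.001),
    Variant("GENE_B", True, True, True, False, True, 0.0),   # maternal only
    Variant("GENE_C", True, True, True, False, True, 0.5),   # too common
    Variant("GENE_C", True, True, False, True, True, 0.0),
    Variant("GENE_D", True, False, True, True, True, 0.0),   # synonymous
]
result = candidate_genes(toy_variants)
print(sorted(result))
```

IN THIS TOY DATA ONLY `GENE_A` SURVIVES, BECAUSE IT IS THE ONLY GENE LEFT WITH A RARE, NONSYNONYMOUS VARIANT FROM EACH PARENT.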
RIGHT, NOW WE GOT TO TWO GENES AND WHICH ONES ARE PATHOGENIC. WE GOT DOWN TO ONE. THAT'S EXCITING, RIGHT. ONE FAMILY, ONE GENE. SO OKAY HERE'S HOW NEXT GENERATION SEQUENCING DATA LOOKS IF YOU HAVEN'T BEEN OVERWHELMED YET. THERE WE GO. SO THIS IS THE NORMAL REFERENCE SEQUENCE ON TOP AND EVERY LINE IS AN INDEPENDENT READ AND THE ARROW IS THE VARIANT IN QUESTION, IT SHOULD BE A BEGAN. SHE HAS -- IT WAS READ 63 TIMES, 29 WITH THE MUTATION AND WE KNEW IMMEDIATELY SHE GOT THIS FROM HER DAD. HERE'S THE OTHER ALLELE. SHE HAS A C. THIS ONE HAPPENED TO BE READ 231 TIMES, 112 OF THEM WITH THE MUTATION. WE SANGER SEQUENCED OF COURSE AND WERE ABLE TO SEE TWO PEAKS TO SUPPORT THE HETEROZYGOUS. NOW IS THIS THE DISEASE GENE. I DON'T THINK YOU CAN STOP THERE. THAT'S WHERE THE BEGINNING OF THE RESEARCH STARTS. SO IT SHOWS SEGREGATION ANALYSIS IN A SMALL FAMILY. THERE WAS NOBODY ELSE. AND SO WHAT ELSE DO WE HAVE TO DO TO PROVE THIS IS THE DISEASE GENE. SO THIS IS ONE CHILD IN THE CLINIC, YOU KNOW. SHE NEEDS TO FIND SOMEBODY OR YOURSELF TO STUDY IT. AND THIS HAS BEEN OVER THE LAST YEAR NOW THAT WE'VE BEEN DOING THIS FOR THIS ONE CHILD. AT THIS POINT CAN YOU TELL THE FAMILY, CAN THEY ACT ON IT. CAN THEY DO PRENATAL GENETIC DIAGNOSTIC TESTING OR PREIMPLANTATION DIAGNOSTIC TESTING. SOME PEOPLE SAY NOT UNTIL YOU PUBLISH BUT YOU CAN'T PUBLISH UNTIL YOU'VE PROVEN. THIS CAN GO ON FOR YEARS. IF IT'S AN ENZYME YOU CAN DO AN ENZYME ACTIVITY ASSAY. WE HAVE HAD THE SUCCESS WHERE WE FOUND GENES THAT DO THAT. LET'S SAY IT'S NOT. IS IT A MITOCHONDRIAL GENE. IF IT'S LOCALIZED IN THE MITOCHONDRIA LIKE THIS ONE WE'RE WORKING ON, ONE OF THE GENES LOCATED IN THE INNER MITOCHONDRIAL MEMBRANE, THAT'S GREAT. WHAT DO WE HAVE TO STUDY, DO WE HAVE TO TODAY CLEAR MITOCHONDRIAL FUNCTIONS, PUT IT BACK IN THE GENE AND SHOW WE RESCUED IT, THAT WOULD BE VERY CONVINCING. ARE HER CELLS ENOUGH, DO WE NEED TO CREATE A MOUSE. DO WE NEED TO MAKE -- IF IT'S A MITOCHONDRIAL DNA VARIANT. SOME ARE HARDER THAN OTHERS. 
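The read counts quoted for the two alleles (29 of 63 reads and 112 of 231 reads carrying the mutation) can be turned into allele fractions with simple arithmetic. The sketch below is only an illustration: the function names and the 0.3 to 0.7 window used to call a site "consistent with heterozygous" are assumptions, not thresholds stated in the talk.

```python
def alt_fraction(alt_reads, total_reads):
    """Fraction of reads at a position that carry the variant base."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def consistent_with_heterozygous(alt_reads, total_reads, low=0.3, high=0.7):
    """True when roughly half the reads carry the variant.

    The low/high window is an assumed rule of thumb for this sketch.
    """
    return low <= alt_fraction(alt_reads, total_reads) <= high


# Read counts quoted in the talk for the two alleles.
paternal = alt_fraction(29, 63)    # about 0.46
other = alt_fraction(112, 231)     # about 0.48
```

Both fractions sit close to one half, which is what a heterozygous call would be expected to look like before any confirmation by a second method.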
THIS IS ONE CASE. WE'RE DOING DOZENS OF THESE CASES AND FINDING TANTALIZING HITS THAT REQUIRE ALL THIS RESEARCH. I THINK YOU CAN SEE THOSE CELLS AND TISSUE ARE NOW INCREDIBLY IMPORTANT BOTH TO DIAGNOSE THESE CHILDREN, RIGHT, IN A RESEARCH SETTING WHEN THE CLINICAL SETTING MAY NOT GET IT AND THEN TO FIGURE OUT THE CAUSE. SO IN THIS CHILD'S CELLS WE WORKED, WE HAVE THE -- THERE'S A LOT OF MITOCHONDRIAL RESEARCHERS. DOCTORS -- ARE IN OUR SCHOOL, THEY HAVE VAST EXPERIENCE WITH GENES INVOLVED IN THE INNER MITOCHONDRIAL MEMBRANE AND SO THERE'S THE IMPORT ASSAY SAYING IS THIS WORKING AND TAKE A -- IN A CONTROL CELL LINE AND A CHILD'S CELL LINE AND THEY BASICALLY SHOWED THEY BELIEVE THIS GENE'S INVOLVED IN MITOCHONDRIAL IMPORT OF PROTEINS INTO MITOCHONDRIAL NATURE AND TO SHOW IT WAS NORMAL IN THE CONTROL AND IT'S ABNORMAL IN THE CELLS. IS THAT ENOUGH. WE HAD TO MAKE MITOCHONDRIAL CELLS THAT TOOK MONTHS TO DO. CELLS DON'T HAVE A LOT OF MITOCHONDRIA TO BE ABLE TO DO THIS. WE HAD TO BE ABLE TO SAY THAT FOR THIS GENE, SORRY, FOR THIS PROTEIN THAT WE TRIED TO IMPORT DIRECTLY INTO THE MITOCHONDRIA, IT WAS LESS -- AGAIN MORE WORK FOR ONE CHILD. YOU CAN'T SEEN THIS IS THE CAUSE FOR MITOCHONDRIAL DISEASE. WELL DOES IT MATTER. FOR A LOT OF NON-RARE DISEASES -- EVERY ONE OF THESE MITOCHONDRIA TISSUES COULD HAVE -- IS IT FAIR TO SAY THIS ISN'T A DISEASE UNLESS I FIND IT AGAIN. IT IS HIGHLY HETEROGENEOUS. I THINK NEXT-GEN SEQUENCING IS REALLY A WONDERFUL OPPORTUNITY TO GET AT THE ETIOLOGY OF THESE -- TO BE ABLE TO SAY IN A GIVEN COMMUNITY, IN THE MITOCHONDRIA COMMUNITY, HOW FREQUENT IS A GIVEN ALLELE. DO MY VARIANTS HAVE TO GO THROUGH ALL MY COLLABORATORS FOR THE MITOCHONDRIA DNA OR CAN I LOOK AT A COMMON DATABASE AND SAY THIS VARIANT WAS SEEN NEVER OR THIS VARIANT WAS SEEN 50% OF THE TIME. SO WE COULD START TO INTERPRET AS A GROUP THE RELEVANCE OF EACH OF THESE MUTATIONS AND HOPEFULLY HELP OUR FAMILIES IN A REAL FASHION. 
AGAIN I STRONGLY FEEL EVERY PATIENT IS A TRANSLATIONAL PROJECT -- AGAIN, I THINK WHAT WE'RE DOING IS CHANGING THE WAY WE'RE THINKING OF THESE RARE DISEASES. I THINK WE ALWAYS THOUGHT OF MITOCHONDRIAL DISEASES AS MULTIORGAN DISEASES. TISSUE ABNORMALITIES SEE IF IT'S ENERGY DYSFUNCTION BUT WE'RE GETTING TO THE PRECISE GENETICS AND PRECISE MOLECULAR PATHWAYS. EXACTLY WHAT WE TARGET. WAY ABLE TO DEFINE A REAL WAY WITH MITOCHONDRIAL DISEASE. THIS IS ONE OF THE EFFORTS WE'RE DOING AND NOW THE BIOMARKERS FOR SUBSET OF DISEASES GIVEN TO FOLLOW THEM BY DEVICE THERAPIES FOR THE SPECIFIC PATHWAY INVOLVED, NOT JUST FOR MITOCHONDRIAL IN GENERAL. AGAIN THERE ARE MANY DIFFERENT ASPECTS TO WHAT IT TAKES TO UNDERSTAND YOUR PATIENT. THERE'S THE CLASSIC MITOCHONDRIAL MORPHOLOGY AND FUNCTION, THE GENETICS AND OF COURSE THE CELL PATHOPHYSIOLOGY. AND THEN I GUESS TO END, IS THERE THERAPY. SO ONCE YOU'VE DIAGNOSED THESE DISEASES, WHAT CAN YOU DO. WE PUT TOGETHER A LIST A COUPLE YEARS BACK OF THE COMMON SUPPLEMENTS, COENZYME Q, CREATINE, ARGININE. WHAT DOSES SHOULD YOU TAKE. NOBODY KNEW. THERE ARE DOSES OF THIS YOU CAN TAKE, THERE ARE RISKS TO SOME OF THEM. JUST BECAUSE THEY'RE A VITAMIN AND THEY'RE AVAILABLE AT GNC, THAT DOES NOT NECESSARILY MEAN THEY ARE EFFECTIVE OR HARMLESS. WHAT'S HAPPENING IN ALL RESPIRATORY CHAIN DISEASES FOR EXAMPLE. WHAT ARE THE SIGNALS THAT ARE AT VARIANCE THAT WE ARE TRYING TO CORRECT. SO IN OUR MODEL ONE OF THE MAJOR SIGNALS IS THE CHEMICALS THAT ARE FEEDING IN, THE REDUCING EQUIVALENTS TO THE RESPIRATORY CHAIN, WE THINK ARE OUT OF BALANCE. THIS IS AFFECTING THE PATHWAYS AND WE THINK THAT'S HAVING A LOT OF SECONDARY EFFECTS ON THE WHOLE PERSON. THAT'S WHY YOU'RE SEEING THINGS LIKE FATTY LIVER OR LIVER FAILURE OR CERTAIN HEART PROBLEMS. WE WERE ACTUALLY ABLE TO SHOW IN THE HUMAN CELL LINES THAT I SHOWED YOU, THIS IS ONE CHILD CELL LINE WITH A SYNDROME OF A COMPLEX ONE DEFICIENCY. 
AND THE FAMILY HAS HEAD TREE NEUROPATHY BUT HAS A MUCH HIGHER LEVEL OF PERMUTATION -- AND HER CELLS ARE DEFICIENT BOTH IN ANY GENE AND -- IS ABNORMAL. SO WE THOUGHT CAN WE TREAT THIS WITH A PRECURSOR SO WE GAVE HER AN NAD PRECURSOR CALLED -- WHICH IS ESSENTIALLY VITAMIN B3 OR NIACIN AND WE REVERSED THIS. NOT ONLY THAT, WE REVERSED SOME OF THE SIGNALING ABNORMALITIES AND WE RESTORED HER RESPIRATORY CAPACITY. SO IT WAS ABNORMAL AND WITH THE DRUGS WE WERE ABLE TO IMPROVE RESPIRATORY CAPACITY. WE KNOW THIS IS WELL BEYOND -- THERE ARE MANY CELLULAR PROBLEMS THAT WE'VE SEEN FROM WORMS AND MICE AND NOW IN HUMANS. THIS IS JUST TO SAY THAT WE'RE SEEING HUGE PATTERNS OF ALL THE MITOCHONDRIAL DISEASE PATIENTS. THESE ARE JUST IN MUSCLE AND/OR CELL LINE TISSUES FROM THE VERY SAMPLES I SHOWED YOU. THERE'S HUGE CONSISTENCY, THERE IS A COMMON RESPONSE TO WHAT'S HAPPENING. AND THERE ARE CERTAIN GENES THAT ARE LEADING US TO WHAT PATHWAYS ARE INVOLVED. FOR EXAMPLE DR. -- WORK HAS SHOWN THAT THE RIBOSOMES ARE DIFFERENTLY REGULATED IN MITOCHONDRIAL DISEASE AND CYTOSOL IN MITOCHONDRIAL CYTOSOLS ARE VERY DIFFERENT. WHAT ARE THE SIGNALING PATHWAYS AND WE'RE NOW ABLE TO PROFILE. THAT IS THE SAME CHILD WITH COMPLEX ONE DEFICIENT COPY. WE CAN PROFILE SOME OF THESE SIGNALING PATHWAYS AND FIGURE OUT EXACTLY WHAT'S WRONG WITH HER. AND MAYBE THEY'LL COME UP WITH THERAPIES THAT WILL REVERSE THE PROBLEMS IN HER BODY, IN HER CELLS. SO AGAIN MY VERY LAST SLIDE, RESPIRATORY CHAIN FUNCTION IS ABSOLUTELY CHANGING METABOLISM, GLOBAL SIGNALING, THAT WE CAN CHARACTERIZE THESE CHANGES AND HOPEFULLY THERAPEUTICALLY CHANGING THEM. OF COURSE NO PERSON IS AN ISLAND, JUST TAKING THE WORK OF MANY PEOPLE IN MY RESEARCH LABORATORY BOTH PRESENT AND PRIOR. IT'S AN EXCELLENT COLLABORATOR. -- AND HIS COLLABORATORS AS WELL WHO HELP WITH THE NEXT GENERATION SEQUENCING, COLLABORATORS AT THE UNIVERSITY OF PENNSYLVANIA INCLUDING DR. 
-- WHO HAS DEVELOPED NAD ANALYSIS AND COLLABORATORS IN CALIFORNIA WHO REALLY WERE CRITICAL TO HELPING US GET THE TISSUES WE NEEDED TO DO THE STUDIES THAT I SHOWED YOU AND OF COURSE THE FUNDING. THAT'S IT. [APPLAUSE] >> IF YOU HAVE ANY QUESTIONS FOR DR. FALK, YOU CAN STAY AFTER THE PRESENTATION TO TALK. >> IT'S MY PLEASURE TO INTRODUCE DR. RESNICK WHO IS AT THE UNIVERSITY OF MARYLAND. HE HAS JOINT APPOINTMENTS IN THE DEPARTMENT OF LINGUISTICS AND THE INSTITUTE FOR ADVANCED COMPUTER STUDIES. HIS RESEARCH FOCUSES ON COMBINING KNOWLEDGE-BASED AND STATISTICAL METHODS FOR NATURAL LANGUAGE PROCESSING WITH AN EMPHASIS ON MULTILINGUAL ISSUES AND COMPUTATIONAL SOCIAL SCIENCE. AND I WANTED TO KEEP THIS BRIEF FOR TIME BUT HIS TITLE TODAY IS UNLOCKING THE UNSTRUCTURED INFORMATION IN ELECTRONIC HEALTH RECORDS: THE ROLE OF NATURAL LANGUAGE PROCESSING TECHNOLOGY IN INTEGRATING PATIENT CLINICAL INFORMATION. DR. RESNICK. >> THANK YOU SO MUCH. AND THANK YOU VERY MUCH FOR THE INVITATION TO BE HERE. AS YOU MIGHT HAVE GOTTEN FROM THE BIO, I'M A LITTLE BIT OF AN OUTLIER. THE DR. UP THERE IS NOT THE DOCTOR DOCTOR. I'M A COMPUTER SCIENTIST PRIMARILY, AND I WORK WITH ISSUES HAVING TO DO WITH HUMAN LANGUAGE. OVER THE LAST 12 OR 13 YEARS, I'VE BEEN DOING WORK SOMETIMES BUT LARGELY OUTSIDE THE ACADEMIC RESEARCH CONTEXT ON AUTOMATIC PROCESSING OF THE LANGUAGE IN CLINICAL RECORDS. SO TO BEGIN, AN EXAMPLE. IN 1996, THE AMERICAN ACADEMY OF PEDIATRICS INTRODUCED SOME GUIDELINES. THEY BASICALLY SAID IF A YOUNG CHILD COMES IN WITH A -- SEIZURE THAT THAT CHILD SHOULD GET A LUMBAR PUNCTURE IN ORDER TO CHECK FOR MENINGITIS, BACTERIAL MENINGITIS. AS YOU PROBABLY KNOW IT'S NOT A PLEASANT EXPERIENCE. BUT THIS WAS SOMETHING THAT WAS CONSIDERED NECESSARY BECAUSE THAT MIGHT NOT MANIFEST IN ANY OTHER WAY. IN 2009, THE -- COLLEAGUES REVIEWED, THEY DID A RETROSPECTIVE REVIEW OF MEDICAL RECORDS. 
THEY LOOKED AT 704 CHILDREN THAT HAD COME INTO THE EMERGENCY ROOM THAT MET THE CRITERIA, 271 OF WHOM HAD HAD A LUMBAR PUNCTURE. AND OUT OF THOSE 704 ALTOGETHER, THE NUMBER OF CASES WHERE THERE ACTUALLY WAS MENINGITIS WAS ZERO. WHY AM I STARTING WITH THIS? BECAUSE IN ORDER TO FIND THOSE CASES -- I SHOULD ADD BY THE WAY, THE AMERICAN ACADEMY OF PEDIATRICS REVISED THIS RECOMMENDATION NOT TOO LONG THEREAFTER. SO WHY AM I STARTING WITH THIS? BECAUSE IN ORDER TO FIND THESE CASES, THEY COULDN'T GO LOOK FOR AN ICD CODE. THEY COULDN'T GO LOOKING IN THE FIELD OF DATA. TO FIND CASES WHERE A CHILD WAS SHOWING UP WITH FIRST FEBRILE SEIZURE THEY HAD TO LOOK IN THE LANGUAGE OF THE CLINICAL RECORDS BECAUSE THERE'S A LOT OF INFORMATION LOCKED UP THERE IN THE LANGUAGE OF THE CLINICAL RECORDS. SO THE MAIN SUBJECT THAT I'M GOING TO TACKLE TODAY IS THE QUESTION: WITH THE WIDESPREAD ADOPTION OF ELECTRONIC MEDICAL RECORDS, WHAT HAPPENS TO THAT LANGUAGE. AND WHAT IS THE IMPACT BOTH FOR CLINICAL CARE AND IN THE RESEARCH CONTEXT. SO WHAT'S HAPPENING IS A SHIFT FROM A TRADITIONAL MODEL WHICH AS YOU CAN SEE HERE IS ONE WHERE CLINICIANS BY AND LARGE ARE USING A LOT OF CLINICAL LANGUAGE OFTEN IN -- AND THAT LANGUAGE IS TRANSFORMED TO A GREAT EXTENT AS PART OF A REVENUE CYCLE BECAUSE YOU NEED CODES TO GET PAID. THEY'RE TRANSFORMING THESE THINGS INTO STANDARDIZED STRUCTURE FOR EXAMPLE IN -- CODES. THAT'S INTERESTING. DID I DO THAT? THANK YOU. SO AS A RESULT OF THESE CODES, RIGHT, THIS IS HOW PAYMENT IS DONE. THIS IS A VERY TRADITIONAL MODEL. WHAT WE'RE SEEING IS A SHIFT WHERE THERE IS A DESIRE DOWNSTREAM AMONG PATIENTS AND RESEARCHERS AND CLINICIANS AND POLICY MAKERS FOR STRUCTURED DATA IN ORDER TO DO ALL SORTS OF IMPORTANT THINGS USING AGGREGATE ANALYSIS, SOME OF WHICH WAS MENTIONED IN THE PREVIOUS TALK, WHAT YOU GET OUT OF THE POWER OF LOOKING AT LARGER QUANTITIES OF DATA. THERE IS A PUSH TO MOVE THE DATA ENTRY UPSTREAM, RIGHT. 
ELECTRONIC RECORDS ARE STARTING TO TAKE A DIFFERENT FORM, AND THE REWARDS, THE DOLLAR SIGNS COMING IN THE OTHER DIRECTION, ARE NOT JUST ABOUT REIMBURSEMENT, THEY ARE MEANINGFUL USE INCENTIVES TO MOVE TO A DIFFERENT WAY OF GETTING INFORMATION INTO THE ELECTRONIC RECORD, A WAY THAT LOOKS MORE LIKE THIS. THIS IS A SNAPSHOT FROM SOME YEARS BACK OF AN ELECTRONIC MEDICAL RECORD. IN THE PRODUCT REVIEW OF THIS ONE, THE SYSTEM ENABLES PHYSICIANS TO POINT AND CLICK THEIR WAY THROUGH AN ENTIRE EXAM QUICKLY AND EFFORTLESSLY. TYPICALLY WITH THEIR HEAD DOWN HERE INSTEAD OF FACING UP TOWARDS THE PATIENT. RIGHT. SO THERE IS A MOVE. NOW, THIS HAS ENORMOUS VALUE. THERE'S A LOT OF INFORMATION. I MEAN IT'S CRAZY TO DICTATE BLOOD PRESSURE. ALL RIGHT. I MEAN, IT'S CRAZY TO RELY ON THE FULL RICHNESS OF HUMAN LANGUAGE IN ORDER TO EXPRESS THE IDEA THAT THIS PILL SHOULD BE TAKEN ONE TIME PER DAY ORALLY AND BY THE WAY, SOMEBODY THAT I KNOW COMPILED A LIST OF AT LEAST A THOUSAND DIFFERENT WAYS, IN THE SAMPLE THAT WE LOOKED AT, OF EXPRESSING TAKE THIS PILL ONE TIME PER DAY ORALLY. RIGHT. BECAUSE IT'S A COMBINATORIC ISSUE. YOU'VE GOT A LOT OF POSSIBILITIES. THERE ARE A LOT OF REASONS WHY WE WANT STRUCTURED DATA, SOME OF WHICH SHOULD IN FACT BE TAKING PLACE UPSTREAM WHEN PEOPLE ARE PUTTING THINGS INTO THE RECORD. BUT THERE'S AN IMPORTANT QUESTION TO ASK WHICH IS WHAT IS LOST. SO I STOLE SOME VERY PROVOCATIVE SENTENCES FROM AN EDITORIAL FROM AN MD IN PENNSYLVANIA LAST YEAR WHO POINTS OUT THAT IN YEARS PAST A WELL WRITTEN HISTORY AND PHYSICAL OR PROGRESS NOTE WOULD UNFOLD LIKE A STORY. THERE IS THIS NOTION. I'M NOT A CLINICIAN BUT I'VE SPOKEN TO A NUMBER OF THEM AND THERE IS SORT OF UNIVERSALLY THIS NOTION THAT THERE IS IN FACT A STORY, A NARRATIVE THAT IS PART OF WHAT IS GOING ON WITH THE PATIENT. I LOVE THE EXPRESSION IN THE PREVIOUS TALK, HIGH LOCAL KNOWLEDGE OF SUBJECT PHENOTYPES. WHICH AT LEAST IN PART HAS TO MEAN YOU KNOW WHAT'S GOING ON WITH YOUR PATIENT. RIGHT? 
THAT'S PART OF WHAT THESE RECORDS HAVE ALWAYS BEEN ABOUT. SO AS AN EXAMPLE, RIGHT. HERE IS AN ANONYMIZED RECORD OF SOMEBODY WHO CAME IN WITH SHORTNESS OF BREATH, AND THE STUFF IN BLUE HERE IS AN EXAMPLE OF THE KINDS OF THINGS THAT YOU WILL TYPICALLY SEE IN THESE STRUCTURED KINDS OF MEDICAL RECORDS. YOU HAVE THE SHORTNESS OF BREATH AT THE TOP. YOU HAVE THE AGE, YOU HAVE THE SEX OF THE PATIENT, YOU HAVE THE FACT SHE WAS ADMITTED. YOU HAVE IN THE MIDDLE THERE POSITIVE TEST RESULTS AND SO FORTH. HERE IN RED ARE SOME OF THE THINGS THAT DON'T FIT QUITE SO NEATLY INTO CHECK BOXES IN TRYING TO PUT THESE THINGS INTO THE RECORD IN THOSE FORMS. SO PROGRESSIVELY GETTING WORSE AND IT COMES ON WITH ANY KIND OF ACTIVITY WHATSOEVER. A LOT OF ELABORATIONS. MILDLY RESPONDING TO PARTICULAR, I MEAN YES THE RECORD'S GOING TO CONTAIN HOW SHE'S BEING TREATED. BUT THE FACT THAT SHE'S RESPONDING AND/OR THE DEGREE OF THE RESPONSE, IT'S NOT NECESSARILY THE CASE THAT YOU'RE GOING TO FIND A PLACE FOR THAT WHEN YOU REDUCE THE WAY YOU'RE EXPRESSING WHAT'S GOING ON WITH THE PATIENT TO A SET OF CHECK BOXES, STRUCTURED FIELDS. I WOULD ADD, BY THE WAY, THAT MOST OF THESE FORMS OF PUTTING DATA IN DO IN FACT HAVE SOME KIND OF FALLBACK FOR ADDING LANGUAGE IN TEXT BOXES, GO AHEAD AND TYPE THE ELABORATION. THE PROBLEM, AGAIN FROM SPEAKING TO CLINICIANS, THERE ARE A COUPLE PROBLEMS WITH IT. ONE IS THERE'S A LOT OF DISINCENTIVE TO USE THOSE, RIGHT. I MEAN IT TAKES EXTRA TIME TO GO AND DO THAT. A SECOND PROBLEM IS IT FRAGMENTS THE STORY. YOU HAVE THIS TEXT BOX HERE, YOU HAVE THAT TEXT BOX THERE. WHAT YOU WANTED WAS STRUCTURED DATA DOWNSTREAM FOR THE IMPORTANT USES OF AGGREGATED DATA. NOW WE HAVE UNRESTRICTED LANGUAGE AGAIN IN THE TEXT BOX. WHAT ARE WE GOING TO DO WITH THAT? SO FOR PEOPLE IN THIS ROOM THIS IS PROBABLY OLD HAT, BUT FOR ME IT'S A LITTLE BIT OF A REVELATION. I SPOKE TO A DOCTOR ABOUT THIS EXAMPLE. THIS IS THE SAME PATIENT LATER WHO AGAIN WAS THERE WITH SHORTNESS OF BREATH. 
AND HE POINTED OUT THAT THIS SENTENCE, BLOOD PRESSURE WAS BROUGHT DOWN AGGRESSIVELY AND THIS COMBINED WITH -- REVERSED RESPIRATORY DISTRESS COMPLEX. THEY ARE TRYING TO FIGURE OUT IF IT WAS PNEUMONIA OR HEART FAILURE. IF IT WAS PNEUMONIA SHE WOULDN'T HAVE RESPONDED TO THAT. YOU HAVE TO KNOW WHAT'S GOING ON THERE OTHER THAN I GAVE THE TEST. SO THERE'S A CONCERN THAT I HAVE AND A NUMBER OF PEOPLE I'VE SPOKEN TO HAVE THAT THE MEDICAL RECORDS WE'RE STARTING TO MOVE TO QUICKLY ARE PUSHING US IN A DIRECTION THAT MIGHT IN FACT IMPROVE SOME THINGS, QUALITY OF CARE METRICS AND OTHER THINGS ARE DEFINITELY, YOU SEE PEOPLE WHO STUDY THAT, BUT THEY MIGHT BE DEGRADING THE QUALITY OF THE INFORMATION. WE MAY BE LOSING SOMETHING ALSO AND NOBODY'S REALLY LOOKING AT THAT. SO I DECIDED TO. A COUPLE YEARS BACK SOME COLLEAGUES AND I AT A COMPANY CALLED CODERYTE WHICH IS DOWN THE STREET IN BETHESDA, WE DID A STUDY WHERE WE LOOKED AT, THIS IS A SMALL PILOT STUDY, 20 CARDIOLOGY COHORTS. AND LOOK, IN AN IDEAL WORLD WHAT YOU'D LIKE TO DO TO FIGURE OUT IF SOMETHING'S GETTING LOST IS A HEAD TO HEAD COMPARISON WHERE YOU TAKE A CLINICAL ENCOUNTER, YOU HAVE THE CLINICIAN DOCUMENT THE ENCOUNTER, THEN YOU BOP THEM BOTH ON THE HEAD, YOU GIVE THEM AMNESIA AND HAVE THEM DO THE ENCOUNTER AGAIN AND THE DOCTOR USES A STRUCTURED ENTRY ELECTRONIC MEDICAL RECORD AND YOU LOOK AND COMPARE THE INFORMATION THAT APPEARS IN BOTH OF THOSE. YOU CAN'T DO THAT OBVIOUSLY. SO WE DID AN APPROXIMATION. WHAT WE DID WAS WE TOOK THE RECORD, THE DICTATION. WE TOOK AN EXISTING SPECIFICATION OF AN ELECTRONIC MEDICAL RECORD. IN OTHER WORDS WE KNEW WHAT SLOTS ARE THERE TO FILL AND WE HAD TWO PEOPLE, OF WHICH I WAS ONE, WHO GOT FAMILIAR WITH THE RECORD, HIGHLIGHT THE PLACES WHERE YES THIS PIECE OF INFORMATION HAS A HOME IN THIS ELECTRONIC RECORD. SO THOSE ARE THE THINGS IN BLUE. 
YES, THOSE THINGS COULD FIT IN IF ONE WERE USING PICK LISTS OR OTHER WAYS OF GETTING THIS INFORMATION INTO THE RECORD. THEN WE TOOK TWO CARDIOLOGY EXPERTS AND WE DIDN'T TELL THEM WHY WE WERE DOING THE STUDY. WHAT WE DID SAY WAS LOOK, SOMEBODY REFERRED THIS PATIENT TO YOU FROM ACROSS THE COUNTRY. THE INFORMATION THEY GAVE YOU IS THE STUFF IN BLUE OVER THERE. THIS SET OF INFORMATION. BUT EVERYTHING THEY KNEW ABOUT THE PATIENT IS THERE IN THE TEXT. I WANT YOU TO HIGHLIGHT THE STUFF THAT'S THERE IN THE DICTATION THAT WAS NOT INCLUDED THAT SHOULD HAVE BEEN. AND FOR ANYTHING YOU FIND THAT SHOULD HAVE BEEN, I WANT YOU TO RATE IT ON A ONE TO FIVE SCALE AS TO HOW IMPORTANT IT IS CLINICALLY, WHERE ONE IS YES IT WOULD BE NICE TO HAVE AND FIVE IS WHOA, I CAN'T BELIEVE THAT THE REFERRING PHYSICIAN DID NOT TELL ME THIS, THIS IS REALLY SERIOUS. AND THEN IN ORDER TO BE CONSERVATIVE, WE ONLY CONSIDERED THE PIECE OF INFORMATION MISSING IF BOTH CARDIOLOGY EXPERTS HAD MARKED IT INDEPENDENTLY AS MISSING, AND THE SEVERITY OF THE OMISSION IS THE LOWER OF THE TWO: IF ONE MARKED IT AS SEVERITY FOUR AND THE OTHER TWO, IT WOULD COUNT AS TWO -- AND THEN DO A DIRECT COMPARISON. WHAT DID WE FIND. REMEMBER IT'S 20 RECORDS, THIS IS A SMALL N, BUT IN 10 OUT OF 20, RIGHT, THERE WAS AT LEAST ONE PIECE OF CLINICAL INFORMATION OF A 4 OR 5 THAT WAS CONSIDERED TO HAVE BEEN MISSING BY BOTH CLINICIANS. SO THAT'S IN 10 OUT OF THE 20. THERE WAS SOME SERIOUSLY IMPORTANT PIECE OF CLINICAL INFORMATION. THERE WAS NO HOME IN THE STRUCTURED RECORD. NOW, ONE MIGHT SAY LOOK, MAYBE THE STRUCTURED RECORD WAS NOT DEFINED IDEALLY AND SOME THINGS ARE EASY TO FIX, RIGHT. SO DEGREES FOR SYMPTOMS, WAS THE PAIN MILD OR SEVERE. IT'S NOT THAT HARD TO ADD AN EXTRA CHECK BOX. REACTION TO ALLERGIES. THERE'S A RELATIVELY SMALL NUMBER OF THOSE THINGS. YOU COULD IMAGINE LISTING THOSE THINGS OUT. WE WENT THROUGH THESE THINGS AFTER THE FACT, DID A POST HOC ANALYSIS, 
IDENTIFIED THOSE AND GAVE THE ELECTRONIC STRUCTURED RECORD CREDIT AS IF THOSE THINGS WERE ALREADY INCLUDED IN IT, BECAUSE THEY'RE EASILY FIXABLE, AND WE WENT BACK AND SAID OKAY DID THAT FIX EVERYTHING AND THE ANSWER WAS NO. STILL IN 25%, AND BY THE WAY I HAVE TO SAY ONE OF THE CLINICIANS WAS RATHER MORE CONSERVATIVE THAN THE OTHER AND SO OUR CONSERVATIVE METRIC WAS IN FACT PUSHING THINGS DOWN. HE WOULD HAVE HAD A RATHER LARGER NUMBER. BUT STILL, AGAIN, NOT A LARGE N BUT AT THE VERY LEAST SUGGESTIVE OF THE IDEA THAT THERE ARE NON-REMEDIABLE OMISSIONS. THE USE OF NATURAL LANGUAGE TO STRUCTURED INPUT. WHAT KINDS OF THINGS ARE DIFFICULT TO REMEDIATE WHEN WE DID THE ANALYSIS OF WHAT WAS LEFT? WELL, NUANCE OR DETAILED ELABORATION. THE FACT THAT THE PATIENT WAS ABLE TO WALK ON FLAT LEVELS AT A MODERATE PACE FOR THIS LENGTH OF TIME. TEMPORAL OR LOGICAL: THE TACHYCARDIA OCCURRED, WHEN DID IT OCCUR, FAR REMOVED FROM THE ORIGINAL MI. I LIKED THE SECOND ONE THERE IN THE MIDDLE. THE DICTATING PHYSICIAN WAS -- THIS WAS A PILOT. AND ONE CLINICIAN WAS SAYING TO THE OTHER ESSENTIALLY, I'M NOT SURE THAT WE WANT TO SEE THIS GUY'S FAA CERTIFICATION RENEWED UNLESS HE HAD THIS OTHER TEST. IT'S VERY HARD TO IMAGINE THERE BEING ROOM FOR THAT IN THE KIND OF RECORD THE DOCTORS ARE BENDING OVER RIGHT NOW ON THEIR iPADS TRYING TO FIND THINGS. SO WHAT KINDS OF INFORMATION ARE MISSING? IT TURNS OUT THE THINGS THAT ARE DIFFICULT TO REMEDIATE ARE EXACTLY THE THINGS THAT MAKE A NARRATIVE A NARRATIVE. RIGHT. TEMPORAL CONTEXT. THE SEQUENCES OF EVENTS. THE THOUGHT PROCESS OF THE CLINICIAN, RIGHT. WHAT ACTUALLY WAS GOING ON WITH THE PATIENT? NOT JUST A BOILERPLATE CHECKLIST FROM A SET OF THINGS THAT THE DOCTOR PICKS FROM. SO THE POINT HERE, I WOULD ARGUE, IS THAT IF YOU LOSE THE LANGUAGE, YOU LOSE THE STORY. AND THE STORY IS IMPORTANT. NOW I JUST ARGUED THE STORY IS IMPORTANT IN A CLINICAL SENSE. I NOW WANT TO MAKE THE ARGUMENT THAT IT'S IMPORTANT FROM THE PERSPECTIVE OF MEDICAL RESEARCH AS WELL. 
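The conservative scoring rule used in the pilot study (an item counts as missing only when both raters flag it, its severity is the lower of the two ratings, and a record is serious when some item scores 4 or 5) is simple enough to write down. The sketch below is an illustration under assumptions: the dictionary representation and the function names are invented for the example and do not come from the study itself.

```python
def consensus_omissions(rater_a, rater_b):
    """Combine two raters' flagged omissions conservatively.

    Each argument maps an item label to a severity from 1 to 5, listing
    only the items that rater flagged as missing. An item survives only
    if both raters flagged it, and it takes the lower severity.
    """
    shared = rater_a.keys() & rater_b.keys()
    return {item: min(rater_a[item], rater_b[item]) for item in shared}


def has_serious_omission(rater_a, rater_b, cutoff=4):
    """True if any agreed omission scores at or above the cutoff."""
    agreed = consensus_omissions(rater_a, rater_b)
    return any(severity >= cutoff for severity in agreed.values())
```

Taking the minimum means a single cautious rater pulls every score down, which matches the remark that the metric was, if anything, understating the problem.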
THERE IS A, THERE'S A CYCLE. PEOPLE WITH INFORMATION MANAGEMENT SOMETIMES TALK ABOUT THE CHANGE FROM DATA TO INFORMATION TO KNOWLEDGE TO WISDOM. NOW I'M NOT GOING TO TRY TO SAY ANYTHING ABOUT WISDOM HERE. BUT THERE'S A CYCLE HERE WHERE YOU START WITH RAW DATA. AND THEN GIVEN WHAT YOU KNOW, YOU EXTRACT INFORMATION, YOU LABEL THINGS. THIS IS AN INSTANCE OF THIS THING WE ALREADY KNEW ABOUT. YOU HAVE CATEGORIES AND ASSIGN THEM TO THE RAW DATA. IT'S AN INTERPRETATION PROCESS. ONCE YOU'VE GOT THAT YOU CAN MOVE FROM INFORMATION TO KNOWLEDGE BY DISCOVERING REGULAR RELATIONSHIPS AMONG THINGS: THIS DRUG TREATS THAT DISEASE, THESE TWO SYMPTOMS TEND TO CO-OCCUR, THESE TWO DRUGS HAVE AN ADVERSE REACTION. WE IMPROVE OUR CLINICAL KNOWLEDGE, OUR MEDICAL KNOWLEDGE IN GENERAL, AND WE FEED THAT BACK SO THE NEXT TIME WE LOOK AT DATA WE HAVE NEW CATEGORIES AND RELATIONSHIPS TO LOOK FOR. SO THERE'S A CYCLE HERE. WHAT HAPPENS IF WHAT YOU DO IS HAVE THE DOCTORS ENTER INFORMATION RATHER THAN DATA, RIGHT. YOU ONLY ENABLE THEM TO ENTER STUFF THAT YOU ALREADY KNEW WAS IMPORTANT. SO A COUPLE OF EXAMPLES. THIS FIRST FEBRILE SEIZURE IS AN EXAMPLE, RIGHT. THERE WAS NO WAY TO LOOK FOR THIS. SOMEBODY DECIDED IT'S IMPORTANT AND FOUND, IT WAS THERE IN THE LANGUAGE TO BE FOUND, RIGHT. -- OPACITIES. WHEN 64 SLIDES WERE INTRODUCED IN THE EARLY 80 OR 90'S, APPARENTLY A PHENOMENON NEVER SEEN BEFORE STARTED TO SPIKE, WHERE YOU STARTED TO SEE OPACITIES THAT WERE NOT TRANSLUCENT BUT NOT OPAQUE. THEY WERE SORT OF IN THE MIDDLE. IT'S A DIFFERENT KIND OF READ, RIGHT. YOU WERE SEEING SOMETHING NEW. AND THE TERM GROUND GLASS OPACITIES STARTED SHOWING UP BUT NOT IN ANY MEDICAL TERMINOLOGY UNTIL 2001. IT TURNS OUT GROUND GLASS OPACITIES ARE ASSOCIATED WITH LUNG CANCER IF I UNDERSTAND WHAT I READ, BUT IF I UNDERSTAND CORRECTLY THERE ARE SERIOUS IMPLICATIONS FOR THIS. THEY'RE A BETTER INDICATOR OF LUNG CANCER THAN THE FULLY OPAQUE. THEY'RE IMPORTANT. 
AND YET RIGHT NOW WE CAN GO BACK TO 20 YEARS' WORTH OF CLINICAL RECORDS AND WE CAN LOOK AT THOSE THINGS BECAUSE THE DOCTORS DICTATED GROUND GLASS OPACITIES OR THIS STUFF LOOKS LIKE GROUND GLASS. IF WHAT THEY HAD BEEN DOING IS TICKING OFF IS THIS TRANSLUCENT OR OPAQUE, THEY NEVER WOULD HAVE SAID IT AND WE WOULDN'T HAVE IT THERE TO GO BACK TO. THERE WAS A STUDY WITHIN THE LAST COUPLE WEEKS, HEAD LAG, RIGHT. SO WHEN YOU'RE DOING DEVELOPMENTAL TESTING FOR AN INFANT, AND THERE'S A PICTURE THERE, IT'S A GOOD MOTOR CONTROL OF A HEAD AND NECK AND THERE'S AT LEAST SOME SUGGESTIVE EVIDENCE THAT MIGHT BE AN EARLY INDICATOR OF AN AUTISM SPECTRUM DISORDER. ACCORDING TO MY WIFE WHO IS A CHILD PSYCHOLOGIST AND DOES A LOT OF TESTING, THESE SORTS OF EXAMS ARE DONE ALL THE TIME. HEAD LAG DOES NOT HAVE AN ICD CODE. IT HAS A SNOMED CODE BUT THERE IS NO ICD CODE. SWALLOWED A MAGNET. SITUATIONS WHERE KIDS HAVE SWALLOWED MULTIPLE MAGNETS, THEY SORT OF PINCH THE DIGESTIVE TRACT. THERE ISN'T A SNOMED CODE FOR THAT. JUST TRY LOOKING FOR THE WORD MAGNET. YOU'RE GOING TO FIND MRI'S AND A LOT OF STUFF. IF YOU STICK TO STRUCTURED INPUT THE RELEVANT LANGUAGE THAT MAY BE USEFUL FOR RESEARCH PURPOSES NEVER GETS CREATED. SO I'VE JUST SET UP A DILEMMA. EVERYBODY AGREES THAT IN ORDER TO GET WHERE WE NEED TO, AND THE PREVIOUS TALK WAS A WONDERFUL EXAMPLE, THE INFORMATION IS CRUCIAL. WE NEED TO AGGREGATE THE INFORMATION. WE NEED TO BRING STUFF TOGETHER AND SHARE IT AND STUDY IT AND IT NEEDS TO BE IN DISCRETE FORMS THAT WE CAN MINE AND DO STATISTICS ON AND ANALYZE. THE KINDS OF ELECTRONIC MEDICAL RECORDS I WAS SHOWING, I'M SLIGHTLY CARICATURING BUT THERE'S A LOT OF MOVEMENT IN THAT DIRECTION, ARE WIDELY VIEWED AS HOW YOU GET THERE. BUT THE PROBLEM IS, AS I TRIED TO ARGUE, THE TYPICAL EMR'S ARE GOING TO MAKE YOUR JOBS A LOT HARDER BECAUSE THEY'RE GOING TO ELIMINATE OR FRAGMENT CRUCIAL LANGUAGE THAT MIGHT HAVE IMPORTANT POTENTIAL DOWNSTREAM USE. 
THEY'RE GOING TO OMIT INFORMATION THAT CLINICIANS NEED IN ORDER TO COMMUNICATE WITH EACH OTHER EFFECTIVELY AND THEY'RE GOING TO POTENTIALLY LEAD TO THE LACK OF CREATION OF KNOWLEDGE THAT COULD BE USEFUL. SOMEBODY GETS UP IN A TALK AND THEY NAME A DILEMMA, THEY'RE GOING TO PROPOSE A SOLUTION. HERE'S A KNIGHT RIDING IN TO SAVE THE DAY, LABELED NLP FOR NATURAL LANGUAGE PROCESSING. IF YOU DO GOOGLE SEARCHES YOU'LL FIND A LOT OF NEUROLINGUISTIC PROGRAMMING. NOT THAT. SO NATURAL LANGUAGE PROCESSING IS A FIELD THAT'S BEEN AROUND FOR SEVERAL DECADES, REALLY SINCE THE ADVENT OF COMPUTERS. THEY STARTED LIKE IN THE 1940'S BECAUSE THEY WANTED TO BE ABLE TO TRANSLATE RUSSIAN JOURNAL ARTICLES INTO ENGLISH IN ORDER TO KEEP TABS ON WHAT WAS GOING ON. AND THE ENTIRE FIELD IS LARGELY BUILT AROUND THE IDEA OF TAKING UNSTRUCTURED LANGUAGE INFORMATION AND EXTRACTING STRUCTURE FROM IT AUTOMATICALLY. I WANTED TO SHOW YOU THE KINDS OF THINGS YOU DO WHEN YOU'RE ANALYZING CLINICAL RECORDS JUST TO GIVE YOU A FEEL FOR IT. HERE'S A RECORD. ONE OF THE FIRST THINGS THAT YOU DO IS YOU DIVIDE IT INTO RELEVANT PIECES, RELEVANT REGIONS. FOR EXAMPLE, DISTINGUISHING THE HPI FROM THE MEDICAL HISTORY, SOCIAL HISTORY, FAMILY HISTORY. YOU'RE BREAKING THINGS OUT IN SECTIONS. SOMETIMES THAT'S EASY BECAUSE PEOPLE HAVE CONVENIENTLY LABELED THINGS, MEDICAL HISTORY AND FAMILY HISTORY AND CARDIAC AND SO FORTH. SOMETIMES IT'S HARDER. IN CARDIOLOGY, PROBABLY 20% OF THEM LOOK LIKE LONG LETTERS WITHOUT ANY SECTIONS. AND SO THERE'S A TASK TO BE DONE THERE TO BREAK THINGS UP INTO PIECES. I PULLED OUT A COUPLE OF THESE FOR ILLUSTRATION PURPOSES. ANOTHER KIND OF BREAKING THINGS UP INTO PIECES IS YOU NEED TO ACTUALLY IDENTIFY PLACES WHERE THERE IS NEGATED OR EQUIVALENT LANGUAGE, SO JUST BECAUSE YOU'RE LOOKING FOR CHEST PAIN, IF A PATIENT DENIES CHEST PAIN AND THAT'S RECORDED IN THE RECORD, THIS IS NOT AN EXAMPLE OF A PATIENT WITH CHEST PAIN. 
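The negation point (a note that says the patient denies chest pain is not a chest pain case) can be illustrated with a deliberately naive trigger-word check. This is a sketch under assumptions, not the system described in the talk: the trigger list, the five-word window, and the `is_negated` name are all invented for the example, and a real system would need far more ways of saying that something is absent.

```python
import re

# Assumed, far-from-complete list of negation cues for this sketch.
NEGATION_TRIGGERS = ("denies", "denied", "no", "not", "without", "negative for")


def is_negated(sentence, finding, window=5):
    """Return True/False if the finding is mentioned, None if it is absent.

    A mention counts as negated when a trigger appears among the
    `window` words immediately before it. Only the first mention is
    checked, and triggers after the finding are ignored.
    """
    text = sentence.lower()
    position = text.find(finding.lower())
    if position == -1:
        return None
    preceding_words = re.findall(r"[a-z']+", text[:position])[-window:]
    context = " ".join(preceding_words)
    return any(
        re.search(r"\b" + re.escape(trigger) + r"\b", context)
        for trigger in NEGATION_TRIGGERS
    )
```

Plain word spotting would count both "denies chest pain" and "reports chest pain" as hits; even this crude check separates them, though it will miss many phrasings.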
IF YOU THOUGHT THERE WERE A LOT OF WAYS TO TALK ABOUT A PILL BEING TAKEN ORALLY ONCE PER DAY YOU SHOULD SEE HOW MANY DIFFERENT WAYS THERE ARE TO SAY THAT THERE ISN'T SOMETHING. A LOT. ONCE YOU'VE IDENTIFIED THOSE, THE NEXT STEP, A NEXT STEP IS TO IDENTIFY RELEVANT DIAGNOSTIC LANGUAGE. I'M FOCUSING ON DIAGNOSIS HERE, NOT PROCEDURES, NOT SOME OTHER THINGS, JUST FOR PURPOSES OF ILLUSTRATION, WHICH IS THE STUFF I KNOW THE BEST. HERE IT'S NOT A QUESTION OF WORD SPOTTING OR PHRASE SPOTTING. YOU CAN PHRASE THE SAME THING IN LOTS AND LOTS OF DIFFERENT WAYS. THERE'S A FRACTURE IN THE LEFT, AND SO FORTH. AGAIN MANY DIFFERENT VARIATIONS. YOU'RE NOT GOING TO PULL THEM FROM YOUR HEAD AND ENUMERATE THEM. IN ORDER TO GET THEM ALL THE BEST THING TO DO IS LOOK AT THEM AND LEARN FROM LOTS AND LOTS OF DATA, WHICH IS ANOTHER TALK. SO ONCE YOU'VE GOT RELEVANT DIAGNOSTIC INFORMATION, THE NEXT STEP, WELL ANOTHER PIECE IS WHAT LINGUISTS CALL MORPHOLOGICAL ANALYSIS. SO IN THIS CASE WHAT WE CALL MORPHOLOGY HAS TO DO WITH THE WAY WORDS ARE BUILT. IT'S HOW WORDS ARE MADE. SO FOR EXAMPLE PAINS IS PAIN AS ITS ROOT PLUS THE PLURAL. ENGLISH HAPPENS TO BE A LOT SIMPLER THAN OTHER LANGUAGES WITH RESPECT TO MORPHOLOGY. SOME LANGUAGES ARE QUITE A BIT MORE COMPLEX. SOMETIMES THIS MATTERS. SOMETIMES IT DOESN'T. SO IF YOU'RE DOING FOR EXAMPLE ICD-9 CODING THE DISTINCTION BETWEEN THE SINGULAR AND PLURAL FOR PAIN DOESN'T MATTER BUT FOR CYST OR CYSTS IT DOES. THEN YOU NEED TO COMBINE THE INFORMATION YOU DISCOVERED. SO FOR EXAMPLE THE FACT THAT WHAT YOU HAVE HERE IS A 57-YEAR-OLD FEMALE, YOU HAVE THE AGE AND SEX AND THE CATEGORY THAT 57 YEAR OLD -- MODIFYING, OR THE FEMALE DOWN THERE AT THE BOTTOM HAS BEEN HAVING CHEST PAINS, SO THIS IS THE PERSON WHO IS HAVING THIS PROBLEM. 
IF YOU REMEMBER PARSING SENTENCES, IF YOU ARE OLD ENOUGH TO REMEMBER PARSING SENTENCES IN GRAMMAR SCHOOL, BREAKING THEM UP INTO THEIR RELEVANT PIECES, THIS IS THE SUBJECT, THIS IS THE PREDICATE, HERE'S THE NOUN AND THE THING MODIFYING IT, THERE'S THE COMPUTATIONAL VERSION OF THAT DONE AUTOMATICALLY AND ON A LARGE SCALE. ONCE YOU'VE GOT THOSE, YOU EXTRACT STRUCTURED INFORMATION BASED ON THE RELATIONSHIPS YOU'VE DISCOVERED, LIKE THE PROBLEM HERE IS FIBRILLATION, WHAT KIND, ATRIAL. AND SO FORTH. SO THERE'S A BUNCH OF INFORMATION EXPRESSED, AND THEN ONCE YOU'VE GOT THAT, THERE ARE A LOT OF DIFFERENT THINGS YOU CAN DO WITH IT. YOU CAN CONNECT IT TO A TERMINOLOGY, A TERMINOLOGICAL SYSTEM OF SOME KIND, SNOMED, UMLS AND SO FORTH. YOU CAN MAP IT VIA RULES OR STATISTICAL METHODS TO CODES AND CODING SYSTEMS. THIS SLIDE COMES OUT OF SORT OF A REVENUE CYCLE KIND OF DISCUSSION WHERE THE GOAL IS TO GET TO CODES IN ORDER TO GET PAID. SO THAT'S WHY THE EXAMPLES LOOK THE WAY THEY DO. AND THE LOGIC FOR EXAMPLE HERE MIGHT BE TO SAY LOOK, WE HAVE EVIDENCE OF THESE THINGS BUT SOME ARE PERTINENT AND SOME ARE ONLY INCIDENTAL. SOME CODES CAN ONLY BE PRIMARY, SOME CAN ONLY BE LISTED AS NON-PRIMARY AND SO FORTH. SO THERE'S A LOT OF LOGIC AS WELL THAT GOES INTO FIGURING OUT WHAT TO REPORT FOR A DOWNSTREAM APPLICATION. THE POINT HERE THOUGH IS THERE ARE LOTS OF DOWNSTREAM APPLICATIONS. WHAT'S IMPORTANT IS EXTRACTING THE STRUCTURED INFORMATION FROM THE UNSTRUCTURED LANGUAGE. SO HERE JUST A COUPLE OF QUICK SCREEN SNAPSHOTS OF HOW THIS STUFF GETS USED IN A REVENUE CYCLE SETTING JUST TO GIVE YOU A FLAVOR. YOU'VE GOT THE NOTE ON THE LEFT AND HERE ON THE RIGHT THE MACHINE HAS ACTUALLY IDENTIFIED CPT CODES AND ICD CODES FROM THE NOTE. YOU ACTUALLY SEE ICD-9 AND ICD-10 THERE AT THE RIGHT WHICH IS USEFUL FOR LEARNING EXPICCASSO FOR PEOPLE FIGURING OUT ICD-10. AND THIS IS PART OF THE WORKFLOW BECAUSE MACHINES ARE NOT ALWAYS PERFECT. 
SO SOMETIMES THE MACHINE IS, YOU HAVE CONFIDENCE METHODS WHERE YOU CAN SAY WITH STATISTICAL CONFIDENCE YES, THE STRUCTURE THE MACHINE EXTRACTED IS CORRECT. IF YOU DON'T PASS THOSE THRESHOLDS THEN YOU NEED TO PASS THAT TO HUMAN BEINGS FOR REVIEW. THERE'S A CERTAIN AMOUNT OF HUMAN CURATION THAT NEEDS TO GO ON. THIS GOES, ONCE AGAIN, TO THE PREVIOUS TALK, THE DATA RELIABILITY ISSUES. IF YOU HAVE A MACHINE EXTRACTING SOME OF THE DATA, THEN YOU WANT A WORKFLOW IN THE LOOP WHERE SOME OF THE DATA IS ACTUALLY BEING CONFIRMED. ALTHOUGH SOMETIMES YOU CAN DO THAT POST HOC, RIGHT, WHERE YOU ACTUALLY FIND THE RECORDS THAT YOU CARE ABOUT AND THEN ON A SMALLER SAMPLE YOU DO THE WORK OF DOING THAT. THAT IN FACT IS WHAT THEY DID IN THE 2009 PAPER. YOU CAN DO THIS IN EVALUATION AND MANAGEMENT SETTINGS WHERE THERE ARE A LOT OF DIFFERENT SETTINGS OF THINGS, NOT JUST ICD CODES. YOU CAN ACTUALLY GENERATE THE STORY OF A CATHETERIZATION AUTOMATICALLY BY ANALYZING THE LANGUAGE, SO ON THE RIGHT THAT'S AN AUTOMATICALLY GENERATED IMAGE OF THE PATH THAT WAS TAKEN BASED ON THE DICTATION AT THE LEFT. SO ONCE YOU'VE GOT THIS, THERE'S A WHOLE LOT OF AGGREGATE ANALYSIS THAT CAN BE DONE, EVEN WITHOUT GOING INTO A LOT OF DETAIL IN THE STRUCTURING OF THE INFORMATION. IDENTIFYING AND RECRUITING PATIENTS FOR VARIOUS PURPOSES. TRYING TO MANAGE CARE IN VARIOUS WAYS. LOOKING FOR COMPLICATIONS OF VARIOUS KINDS. THERE ARE SO MANY DIFFERENT THINGS YOU CAN DO ONCE YOU HAVE THE DATA STRUCTURED THAT THIS LIST GOES ON AND ON AND ON. THESE ARE JUST SOME OF THE THINGS THE FOLKS I'M WORKING WITH ARE DOING NOW. SO I WANT TO ARGUE THERE'S A WAY FORWARD OUT OF THE DILEMMA. RIGHT. AT THE LOWER LEFT HERE YOU HAVE CLINICIANS AND I WOULD ARGUE THAT THERE IS A MIDDLE GROUND WE NEED TO GET TO, BETWEEN STRUCTURED INFORMATION AND ALLOWING CLINICIANS THE FREEDOM TO USE LANGUAGE IN A FREE AND FLEXIBLE WAY. AND THE WAY TO GET THERE IS NOT TO START WITH HIGHLY RESTRICTIVE STRUCTURED THINGS AND THROW IN LANGUAGE AS AN AFTERTHOUGHT. 
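The confidence-threshold workflow described above (accept what the machine extracts when it clears a confidence bar, and send everything else to a person) can be sketched as a simple routing step. The sketch is an illustration under assumptions: the 0.9 threshold, the placeholder code labels, and the `route_extractions` name are invented here and are not taken from the system shown in the talk.

```python
def route_extractions(extractions, threshold=0.9):
    """Split machine-extracted codes into auto-accepted and human-review lists.

    `extractions` is an iterable of (code, confidence) pairs, where
    confidence is the system's own estimate between 0 and 1. The
    threshold value is an assumption for this sketch.
    """
    accepted, needs_review = [], []
    for code, confidence in extractions:
        if confidence >= threshold:
            accepted.append(code)
        else:
            needs_review.append(code)
    return accepted, needs_review


# Placeholder labels, not real billing or diagnosis codes.
auto, review = route_extractions(
    [("CODE_A", 0.97), ("CODE_B", 0.62), ("CODE_C", 0.90)]
)
```

Raising the threshold sends more items to human reviewers and fewer straight through, so the setting is a trade between reviewer workload and the error rate of the accepted data.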
WE NEED A MAJOR RETHINK, OR YOU, THE PEOPLE WHO ARE HOPEFULLY INTERESTED IN THIS TALK HERE, ARE NOT GOING TO GET ALL OF THE INFORMATION THAT YOU WANT AND NEED, RIGHT. SO WE NEED TO LET THE CLINICIANS FOCUS ON THE CARE OF THE PATIENT AND BE UNIMPEDED IN THEIR COMMUNICATION. IT SHOULD BE STRUCTURED WHERE IT MAKES SENSE AND WE SHOULDN'T BE OVERSTRUCTURING WHERE IT DOES NOT. AND HOW DO YOU DO THAT? YOU USE THE EXISTING MEDICAL KNOWLEDGE TO GO INTO THE SYSTEMS THAT CAN DO THIS, AND NATURAL LANGUAGE PROCESSING, AND EXTRACT STRUCTURED INFORMATION, UNLOCK STRUCTURE FROM THE UNSTRUCTURED INFORMATION. AND IN SO DOING, PRODUCE INFORMATION THAT LEADS TO KNOWLEDGE, THAT LEADS TO UPDATES OF HOW IT IS WE THEN ANALYZE, BOTH RETROSPECTIVELY AND PROSPECTIVELY. SO YOU ACTUALLY TAKE ADVANTAGE OF THE ABILITY TO UNLOCK INFORMATION IN ORDER TO IMPROVE OUR STATE OF KNOWLEDGE, IMPROVE OUR STATE OF CARE AND IMPROVE OUR ABILITY TO DISCOVER NEW THINGS. THANK YOU. [APPLAUSE] >> I WANT TO THANK YOU BECAUSE YOU JUST TOUCHED EXACTLY ON THE ISSUE THAT WE ARE FACING WITH THE GLOBAL RARE DISEASE PATIENT REGISTRY, TRYING TO DEVELOP STANDARDS AND COMMON ELEMENTS TO CAPTURE DATA IN THE SAME WAY, AND AT THE SAME TIME NOT TO LOSE THE INFORMATION THAT'S COMING FROM THE PATIENT, WHO HAS A LOT OF INFORMATION THAT WE DON'T WANT TO LOSE. SO THANK YOU VERY MUCH FOR A WONDERFUL TALK. OUR NEXT SPEAKER, SHE'S THE LAST BUT NOT LEAST, IS DR. RACHEL DVOSKIN. SHE'S AN ANALYST AT THE GENETICS AND PUBLIC POLICY CENTER, WHICH IS PART OF THE INSTITUTE OF BIOETHICS AT JOHNS HOPKINS UNIVERSITY. SHE HOLDS A PH.D. IN ANTHROPOLOGY FROM NEW YORK UNIVERSITY. SHE ACTUALLY DID HER DISSERTATION HERE AT THE NIH SO SHE HAS A CONNECTION TO US.
SHE WORKED FOR TWO YEARS AS A COPY EDITOR -- STUDIED GENETIC AND SOCIAL RISK FACTORS IN HYPERTENSION AND SHE HAS -- SHE'S A MEMBER OF THE AMERICAN COLLEGE OF MEDICAL GENETICISTS AND IS WORKING AT JOHNS HOPKINS UNIVERSITY, WHERE SHE DOES RESEARCH, HELPS DESIGN NEW RESEARCH PROPOSALS AND CONTRIBUTES TO WRITING MATERIAL INCLUDING GRANTS -- MOST IMPORTANTLY SHE'S EXPECTING HER SECOND CHILD. >> GOOD AFTERNOON. THANK YOU SO MUCH, DR. RUBENSTEIN, FOR THE INTRODUCTION AND FOR INVITING ME. IT'S AN HONOR TO BE PART OF THIS SEMINAR. ON THAT LAST POINT, I DO HAVE ABOUT FIVE WEEKS LEFT SO I'LL TRY NOT TO HAVE ANY EMERGENCIES TODAY. I THINK WE'LL BE ALL RIGHT. SO AS YOU ALREADY KNOW, THERE ARE A LOT OF CHALLENGES TO DOING RARE DISEASE RESEARCH, AND AS WE HEARD, DR. FALK AND DR. RESNICK DID A FANTASTIC JOB OF INTRODUCING US TO A NUMBER OF THEM. THERE ARE MANY PRACTICAL AND ETHICAL ISSUES THAT COME UP WITH ANY HUMAN SUBJECTS RESEARCH, AND THESE CAN BE ESPECIALLY TRUE OR MAGNIFIED WHEN WE'RE TALKING ABOUT RARE DISEASE RESEARCH, AND WHEN WE'RE WORKING WITH VULNERABLE POPULATIONS SUCH AS CHILDREN, OR ANY PATIENTS OR FAMILIES WHO MAY BE DESPERATELY SEEKING A DIAGNOSIS, ANSWERS, TREATMENTS, KIND OF ANY INFORMATION ABOUT THEIR EXPERIENCE AND SYMPTOMS. AND SO I'LL TALK ABOUT A FEW OF THESE CHALLENGES RELATED SPECIFICALLY TO RECRUITING AND CONSENTING PARTICIPANTS IN RESEARCH, TO INFORMED CONSENT, SORRY, AND TO PRIVACY AND DATA SHARING. AND I'M GOING TO BE COMING AT THIS LARGELY FROM THE ANGLE OF HOW GENETIC AND GENOMIC DATA PLAY INTO THESE ISSUES, BECAUSE THAT'S OBVIOUSLY THE DIRECTION THIS KIND OF RESEARCH IS GOING, ESPECIALLY WITH WHOLE EXOME AND GENOME SEQUENCING, AND THAT'S MY SETTING AT THE GENETICS AND PUBLIC POLICY CENTER. ONE OF THE OBVIOUS CHALLENGES IN DOING THIS RESEARCH IS SIMPLY COLLECTING AND RECRUITING ENOUGH AFFECTED PATIENTS OR PARTICIPANTS TO POWER MEANINGFUL CONCLUSIONS.
I'LL TALK ABOUT SOME OF THE DIFFICULTIES WITH RECRUITING ENOUGH PARTICIPANTS AND COLLECTING THESE IMPORTANT AND SENSITIVE TYPES OF BIOSPECIMENS AND DATA. I'LL ALSO TALK ABOUT SOME OF THE NEWER CHALLENGES THAT HAVE COME UP WITH THE RAPIDLY ADVANCING GENOMIC AND INFORMATIONAL TECHNOLOGIES, AND THE CONSEQUENT INCREASING NEED FOR MORE CONSISTENT STANDARDS AND GUIDELINES FOR RESEARCHERS TO TURN TO FOR PROTECTING HUMAN SUBJECTS AND FOR DOING THIS TYPE OF RESEARCH. I'LL ALSO EMPHASIZE THE IMPORTANCE OF ENSURING THAT THE PEOPLE CONTRIBUTING OR DONATING THESE SAMPLES AND DATA ARE WELL INFORMED AND ALSO THAT THEIR PRIVACY IS WELL PROTECTED. ONE ISSUE THAT'S COMMON IN RARE DISEASE RESEARCH IS THE CLOSE RELATIONSHIPS THAT OFTEN EXIST AMONG CLINICIANS AND INVESTIGATORS, OFTEN THAT'S THE SAME PERSON, CLINICIAN AND INVESTIGATOR, AND PATIENTS AND FAMILIES. AND THESE RELATIONSHIPS CAN BLUR THE BOUNDARY BETWEEN CLINICAL CARE AND RESEARCH. PATIENTS MAY EXPECT, FOR EXAMPLE, TO RECEIVE ADDITIONAL DIAGNOSES OR TREATMENTS IF THEY PARTAKE IN RESEARCH, AND THIS IS KNOWN AS THE THERAPEUTIC MISCONCEPTION. THERE'S ALSO THE ISSUE OF THE POTENTIAL FOR COERCION WHEN PARTICIPANTS ARE BEING RECRUITED FROM CLINICAL CARE SETTINGS, SO RESEARCHERS SHOULD BE AWARE OF THE POSSIBILITY OF COERCION WHERE PATIENTS OR THEIR FAMILIES MAY BE DESPERATE FOR ANSWERS OR TREATMENT. THEY MAY ALSO BELIEVE THAT THEIR DECISION TO PARTICIPATE OR NOT IN A RESEARCH STUDY MAY ACTUALLY AFFECT THEIR CLINICAL CARE. AND SO THESE PHENOMENA CAN INFLUENCE PEOPLE'S WILLINGNESS OR CONSENT TO PARTICIPATE. TO AVOID TAKING ADVANTAGE OF THE TRUST THAT'S BEEN ESTABLISHED IN THESE RELATIONSHIPS, IT'S IMPORTANT TO BE COGNIZANT OF HUMAN SUBJECTS ISSUES WHEN RECRUITING IS BEING PERFORMED BY CLINICIANS OR ADVOCACY GROUPS, FOR EXAMPLE.
SO TRANSPARENCY IN THE CONSENT PROCESS ABOUT POTENTIAL RISKS, BENEFITS, AND SPECIFIC PRIVACY AND DATA SHARING POLICIES IS ESSENTIAL. AND INVESTIGATORS AND OTHER PEOPLE RECRUITING STUDY PARTICIPANTS SHOULD TAKE CARE TO ENSURE THAT POTENTIAL PARTICIPANTS FULLY UNDERSTAND WHAT THEY WILL AND WILL NOT GET OUT OF PARTICIPATING IN RESEARCH. SO ONE OF THE THEMES I'M GETTING AT HERE IS THAT RAPIDLY CHANGING GENOMIC TECHNOLOGIES AND THEIR USE IN RARE DISEASE RESEARCH, COUPLED WITH THE TREND TOWARD BROADER DATA SHARING, ARE CREATING A UNIQUE SET OF RELATED HUMAN SUBJECTS ISSUES. AND I WANT TO HIGHLIGHT SOME OF THESE POINTS BY PRESENTING SOME QUALITATIVE DATA THAT MY COLLEAGUES AT GPPC AND I HAVE RECENTLY COLLECTED THROUGH INTERVIEWS WITH HUMAN GENETIC RESEARCHERS AND BIOBANK LEADERS. OUR STUDY PARTICIPANTS WERE CLINICAL AND ACADEMIC RESEARCHERS, MANY OF WHOM WERE RECRUITED THROUGH THE AMERICAN SOCIETY OF HUMAN GENETICS, AND ALSO THROUGH ISBER, THE INTERNATIONAL SOCIETY FOR BIOLOGICAL AND ENVIRONMENTAL REPOSITORIES -- PRACTICES SURROUNDING SPECIFICALLY INFORMED CONSENT, PROTECTION OF PRIVACY, DATA SHARING AND THE RETURN OF INDIVIDUAL RESEARCH RESULTS. NOW THIS LAST ONE, THE RETURN OF RESULTS, IS SOMETHING THAT I'M NOT GOING TO TALK ABOUT TODAY IN THE INTEREST OF TIME, BUT IT'S ANOTHER REALLY IMPORTANT CHALLENGE. AND SO MAYBE IF WE HAVE TIME FOR DISCUSSION LATER. SO ONE VERY COMMON CONCERN THAT WE HEARD EXPRESSED BY RESEARCHERS WE SPOKE WITH IS THE HETEROGENEITY OF PRACTICES WITHIN AND ACROSS INSTITUTIONS AND IRBs. MULTISITE STUDIES ARE PARTICULARLY AFFECTED BY THIS, WHERE YOU HAVE MULTIPLE IRBs EVALUATING THE SAME PROTOCOL, ALL WITH THEIR OWN, OFTEN CONFLICTING, REQUIREMENTS. AND MULTICENTER AND MULTISITE STUDIES ARE OFTEN ESSENTIAL IN RARE DISEASE RESEARCH. SO HERE WE HEARD MANY VARIATIONS ON THIS QUOTE: WE HAD TO GO THROUGH ESSENTIALLY 12 IRBs AND EACH OF THE IRBs IS IN A DIFFERENT STATE.
SO WHILE GENOMIC RESEARCH IS CHANGING RAPIDLY, SOME IRBs ARE LEARNING AND ADAPTING FASTER THAN OTHERS. AND THERE'S ALSO AN ISSUE THAT SOME ARE MAYBE LESS SAVVY WITH RESPECT TO GENETIC RESEARCH AND ASSESSING LEVELS OF RISK INVOLVED IN GENETIC RESEARCH. THESE DIFFERENCES IN EXPERIENCE AND FAMILIARITY WITH THIS TYPE OF RESEARCH CAN LEAD TO A WIDE RANGE OF INTERPRETATIONS ABOUT WHAT IS -- NOW I WILL BE TALKING ABOUT IRBs AND FRUSTRATION, BUT I'M NOT TRYING TO ATTACK IRBs. THEY HAVE AN EXTREMELY DIFFICULT JOB. THIS IS JUST TO SAY THAT SOME OF THESE INCONSISTENCIES CAN BE A REAL CHALLENGE FOR DOING THIS TYPE OF RESEARCH. SO ONE WAY THAT THIS CAN HAVE A PROFOUND EFFECT ON ENROLLING PARTICIPANTS IS WHEN IRB OR HOSPITAL APPROVAL IMPEDES ENROLLMENT FOR WILLING PARTICIPANTS IN THE STUDY. -- I WOULD SAY TODAY THAT'S THE BIGGEST ISSUE. SO AN EXAMPLE IS WHEN A PATIENT MAY SHOW UP AT A PHYSICIAN'S OFFICE, SAY IN A MORE REMOTE LOCATION WHERE THEY ARE UNFAMILIAR WITH RESEARCH PRACTICES OR WITH GENETIC RESEARCH, OR WHEN RESEARCHERS WHO ARE DOING THIS WORK ARE CONTACTED BY PHYSICIANS IN DIFFERENT U.S. CITIES OR IN DIFFERENT COUNTRIES WHO HAVE PATIENTS THEY WOULD LIKE TO ENROLL. THESE PHYSICIANS MAY NOT BE PERMITTED, EITHER BY THEIR OWN IRB OR BY THE INVESTIGATOR'S LOCAL IRB, TO CONSENT PARTICIPANTS. OR THEY MAY NOT HAVE THE TIME, THE RESOURCES OR THE DESIRE TO GO THROUGH THEIR OWN IRB BECAUSE IT'S BEEN SUCH A HASSLE. AND HERE, THIS RESEARCHER SAID, HALF THE TIME IT'S THAT PHYSICIANS DON'T WANT TO SEND NEW PATIENTS BECAUSE OF ALL OF THE REGULATORY STUFF THEY HAVE TO PASS THROUGH. AGAIN, AS AN EXAMPLE OF HOW INCONSISTENT STANDARDS CAN AFFECT RESEARCHERS, THIS RESEARCHER TOLD US: IN THE PAST THE CONSENT COULD BE SENT TO PHYSICIANS AT OTHER SITES WHO WOULD THEN TALK TO THEIR PATIENTS ABOUT IT. AT MY CURRENT INSTITUTION THAT'S NOT ALLOWABLE. I HAVE TO PERSONALLY CONSENT EVERYBODY WHO IS IN MY STUDY.
SO TO ENSURE THAT A CONSISTENT STANDARD OF CONSENT IS APPLIED TO ALL PARTICIPANTS WHO ARE BEING RECRUITED FROM ALL AROUND THE WORLD, THIS PERSON IS REQUIRED BY THE INSTITUTION TO PERSONALLY PERFORM ALL THE CONSENTS. WHILE IT'S ABSOLUTELY IMPORTANT FOR PEOPLE TO BE CONSENTED BY SOMEONE WHO IS VERY KNOWLEDGEABLE ABOUT THE STUDY AND WHO MAY BE ABLE TO ANSWER ANY OR ALL QUESTIONS OR CONCERNS THAT SOMEONE MIGHT HAVE, ON THE OTHER HAND, THERE CAN BE REAL IMPEDIMENTS, INCLUDING LANGUAGE BARRIERS, DIFFERENT TIME ZONES, AND JUST SIMPLY LIMITED RESOURCES, TO THE ABILITY OF A SINGLE INVESTIGATOR TO PERSONALLY CONSENT ALL PEOPLE WHO WANT TO PARTICIPATE. RESEARCHERS WE SPOKE WITH TALKED ABOUT SOME IRB REVIEWERS' LACK OF KNOWLEDGE OF OR FAMILIARITY WITH GENETIC AND GENOMIC RESEARCH, AND SOME DESCRIBED WHAT THEY THINK IS A SIGNIFICANT DISCREPANCY BETWEEN THE PERCEIVED AND ACTUAL RISK OF PARTICIPATING IN GENETICS RESEARCH. AND OTHERS NOTED THE INCONSISTENCIES IN HOW IRBs ASSIGN LEVELS OF RISK. SO HERE THIS RESEARCHER SAID: MANY OF THE REVIEWERS ON THE IRBs ARE NOT AS SAVVY AS PARTICIPANTS MIGHT BE. AFFECTED FAMILIES ARE LOOKING TO PARTICIPATE AND THEY WANT ALL THE INFORMATION THEY CAN GET. I THINK AT SOME LEVEL GENETICS IS BEING THOUGHT OF AS SOMETHING DIFFERENT THAN FAMILY HISTORY, AND REALLY THEY HAVE CLOSE TIES. SO MANY RESEARCHERS ACTUALLY SUGGESTED THE IMPORTANCE OF HAVING, OR SAID THAT THEY WISH THERE WERE, GENETICISTS ON IRBs REVIEWING THIS TYPE OF STUDY. SO TO ADDRESS SOME OF THESE ISSUES, WE HEARD REPEATED CALLS FOR THE DEVELOPMENT OF CONSENT TEMPLATES OR BEST PRACTICES FOR RECRUITMENT AND ENROLLMENT IN BROAD RESEARCH, AND THIS INCLUDED SUGGESTIONS FOR MORE EXPLICIT GUIDELINES AND EVEN BOILERPLATE LANGUAGE SPECIFIC TO GENETIC AND GENOMIC RESEARCH THAT CAN BE ADAPTED TO SPECIFIC SITES AND STUDIES. RESEARCHERS RECOGNIZE THE IMPORTANCE AND VALUE OF SHARING BROADLY. -- OF THEM WANT TO DO THIS. SOME AREN'T AS WILLING TO SHARE, FOR REASONS THAT WE'VE HEARD.
ALSO, IT'S BEEN OBSERVED AND REPORTED THAT RESEARCH PARTICIPANTS GENERALLY SUPPORT BROAD SHARING OF THEIR SAMPLES AND DATA. THE CHALLENGE THAT COMES WITH THIS TREND OF BROADER SHARING IS HAVING CLEAR AND CONSISTENT STANDARDS THAT PEOPLE CAN FOLLOW FOR PRIVACY AND THE CONFIDENTIALITY OF THEIR DATA. SO THERE'S A NEED FOR SOME UNIQUE SAFEGUARDS TO PROTECT AGAINST REIDENTIFICATION AND UNINTENDED DISSEMINATION OF PROTECTED HEALTH INFORMATION WHEN GENETIC AND GENOMIC DATA ARE LINKED TO A RICH ARRAY OF INFORMATION SUCH AS SPECIFIC PHENOTYPES, BEHAVIORAL, SOCIAL AND HEALTH INFORMATION, ELECTRONIC MEDICAL OR HEALTH RECORDS, ETCETERA. MAINTAINING THE PRIVACY AND ANONYMITY OF PARTICIPANTS IS A PARTICULAR CHALLENGE FOR RARE DISEASE RESEARCH BECAUSE OF INHERENTLY SMALLER AFFECTED POPULATIONS. AND THESE PROTECTIONS ARE VERY IMPORTANT BECAUSE RARE DISEASE RESEARCH PARTICIPANTS AND THEIR FAMILIES MAY ACTUALLY BE MORE VULNERABLE TO ADVERSE ECONOMIC OR LEGAL CONSEQUENCES, PSYCHOLOGICAL HARM AND VARIOUS FORMS OF DISCRIMINATION. MANY OF YOU MIGHT BE AWARE OF THIS PAPER THAT CAME OUT IN -- GENETICS IN 2008 BY HOMER ET AL., SHOWING IT WAS POSSIBLE TO IDENTIFY AN INDIVIDUAL WITHIN A LARGE SET OF POOLED OR AGGREGATE DATA. THERE HAVE BEEN A COUPLE OF SUBSEQUENT PAPERS SHOWING THAT THIS TYPE OF THING IS POSSIBLE. NIH WAS IN A TOUGH SPOT AND IN RESPONSE TO THIS QUICKLY RESTRICTED ACCESS TO AGGREGATE GENOMIC DATA THAT HAD PREVIOUSLY BEEN INCLUDED IN THE OPEN ACCESS PORTION OF DBGAP, WHERE NIH-FUNDED GWAS, GENOME-WIDE ASSOCIATION STUDY, DATA ARE CURRENTLY DEPOSITED. IN THE TWO YEARS PRIOR TO THIS DECISION THESE DATA HAD BEEN ACCESSED HUNDREDS OF TIMES. SO IN MANY PEOPLE'S OPINION THIS WAS A REAL BLOW TO THE ATTEMPT AT BROADER SHARING OF THESE GENOMIC DATA AND THESE RESOURCES. SO THERE'S A REAL POLICY DILEMMA RIGHT NOW WHEN DATA ARE DEIDENTIFIED AND CODED, MEANING ALL PERSONAL IDENTIFIERS HAVE BEEN REMOVED ACCORDING TO HIPAA STANDARDS AND THE COMMON RULE.
THE RISK OF REIDENTIFICATION MAY ACTUALLY BE GROWING WITH THE PROLIFERATION OF GENETIC DATA SETS THAT CAN BE LINKED TO ONE ANOTHER AND TO SOME MEDICAL RECORDS. SOME RESEARCHERS WE TALKED TO DON'T BELIEVE THAT THIS IS, AT LEAST RIGHT NOW, A VERY PRACTICAL RISK, BUT EITHER WAY, THE INCONSISTENCY IN HOW THIS RISK IS INTERPRETED HAS BECOME A REAL CHALLENGE TO DOING THIS TYPE OF RESEARCH. SO THIS ISSUE OF HETEROGENEOUS STANDARDS ACROSS INSTITUTIONS AND IRBs APPLIES BOTH TO RISK ASSESSMENT AND TO STANDARDS FOR PRIVACY PROTECTION THAT NEED TO BE IN PLACE TO MITIGATE THESE RISKS. NOW MORE THAN EVER, THERE ARE DIFFERENT OPINIONS AMONG BOTH INVESTIGATORS AND REVIEW BOARDS ABOUT WHAT IS IDENTIFIABLE AND WHAT THE ACTUAL RISK OF REIDENTIFICATION IS. FOR EXAMPLE, ONE RESEARCHER TOLD US THAT HE OR SHE MIGHT NOT EVEN TRY TO GET ACCESS TO SOME ANONYMIZED CLINICAL SAMPLES THAT HAD BEEN COLLECTED OVER TEN YEARS AGO AND WERE SITTING IN A FREEZER, BECAUSE THE IRB MIGHT CONSIDER THE SAMPLES IDENTIFIABLE DUE TO THE CONDITION BEING RARE AND WOULDN'T LET THEM USE THE SAMPLES WITHOUT GAINING CONSENT. AND IT MAY BE NEXT TO IMPOSSIBLE TO TRACK DOWN THE ORIGINAL CLINICIAN OR THE PATIENTS WHO DONATED THESE SAMPLES TO RECONSENT THEM. SO THIS IS A PROBLEM, AND RESEARCHERS SAY THIS OBSTACLE TO USING THESE TYPES OF ANONYMIZED SAMPLES PROBABLY GOES AGAINST WHAT THE PATIENTS WOULD HAVE ACTUALLY WANTED. THIS IS ANOTHER QUOTE DEMONSTRATING THE CONCERN THAT THIS ISSUE CAN IMPEDE RESEARCH: I HAVE SOME CONCERN THAT PEOPLE ARE ASSUMING THE GENETIC INFORMATION MAY NOT BE DEIDENTIFIABLE, AND THAT WOULD SLOW DOWN RESEARCH. COULD IT BE REIDENTIFIED? POSSIBLY, BUT HOW MANY -- DO PEOPLE HAVE TO GO THROUGH? I WOULD BE MORE CONCERNED ABOUT SOMEBODY GETTING INTO THE ELECTRONIC MEDICAL RECORDS SYSTEM. WE HAVE TOO MUCH FOCUS ON THE GENETICS AND I HOPE THAT DOESN'T REDUCE SOME OF THE RESEARCH OPPORTUNITIES.
I WANT TO DEMONSTRATE THE IDENTIFIABILITY ISSUE WITH THE EXAMPLE OF A WEBSITE THAT WAS CREATED TO IDENTIFY RARE DISEASE MUTATIONS. THIS IS THE CFTR2.ORG WEBSITE. MY COLLEAGUE MICHELLE LEWIS WAS VERY INVOLVED IN DEVELOPING THIS WEBSITE. THIS INVOLVES -- CLINICAL DATA FOR THE CFTR2 DATABASE, COLLECTING INFORMATION ABOUT CF PATIENTS FROM NATIONAL CF REGISTRIES AND LARGE CF CLINICS AROUND THE WORLD. THEY HAVE IRB APPROVAL TO POST ON THE WEBSITE DEIDENTIFIED AGGREGATE DATA FROM PEOPLE IN THE CFTR2 DATABASE. THEY DO NOT HAVE CONSENT TO PUT INDIVIDUAL-LEVEL INFORMATION ON THE WEBSITE. SO WHAT THEY USE IS A GENERAL RULE, THIS SORT OF RULE OF FIVE, WHICH IS COMMON IN PUBLIC DATABASES: THEY WILL NOT REPORT ON MUTATIONS FOUND IN FEWER THAN FIVE PEOPLE WORLDWIDE. THE RISK OF IDENTIFIABILITY IS TOO HIGH, AND CAN YOU REALLY MAKE USEFUL CONCLUSIONS OR GENERALIZATIONS ABOUT THE MUTATIONS FROM SO FEW CASES? BUT THE CHALLENGE FACING THEM NEXT IS WHAT TO DO WITH MUTATIONS THAT DO APPEAR IN FEWER CASES. RESEARCHERS MAY WANT TO GIVE AFFECTED FAMILIES AS MUCH INFORMATION AS THEY CAN, BUT THEY ALSO DON'T WANT TO GIVE INFORMATION THAT MAY MAKE REGISTRANTS IDENTIFIABLE, OR INFORMATION THAT THEY DON'T YET UNDERSTAND AND THAT MAY ACTUALLY BE MISLEADING TO SOMEONE WITH THAT MUTATION. SO SAY THERE ARE ONLY THREE PEOPLE IN THE WORLD WHO HAVE A PARTICULAR CF MUTATION, AND SAY YOU PUT ONLY AGGREGATE DATA ON THE WEBSITE BY LISTING, SAY, AVERAGE LUNG FUNCTION FOR THESE THREE PEOPLE. SAY I'M ONE OF THE THREE PEOPLE AND I KNOW MY LUNG FUNCTION IS 50, AND I LOOK ON THE WEBSITE AND SEE THE AVERAGE LUNG FUNCTION IS 80. I CAN TELL SOMETHING PRETTY SPECIFIC ABOUT THOSE OTHER TWO PEOPLE. IN ADDITION, SOME PEOPLE ARE USING SOCIAL MEDIA MORE AND MORE TO SHARE THEIR GENOTYPE AND HEALTH INFORMATION AND KIND OF REACH OUT TO OTHER PEOPLE, LOOK FOR EXPLANATIONS AND ANSWERS.
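[ILLUSTRATIVE SKETCH, NOT PART OF THE SPOKEN REMARKS] THE RULE OF FIVE AND THE LUNG FUNCTION EXAMPLE ABOVE CAN BE WORKED THROUGH IN A FEW LINES. THIS IS A MINIMAL SKETCH UNDER ASSUMPTIONS: THE FUNCTION NAMES `publishable_mean` AND `mean_of_others` AND THE SAMPLE VALUES OTHER THAN 50 AND 80 ARE HYPOTHETICAL, AND NOTHING HERE DESCRIBES HOW THE CFTR2 WEBSITE IS ACTUALLY IMPLEMENTED. THE FIRST FUNCTION WITHHOLDS AN AVERAGE WHEN FEWER THAN FIVE PEOPLE CONTRIBUTE TO IT. THE SECOND SHOWS WHY THAT MATTERS: SOMEONE WHO KNOWS THEIR OWN VALUE AND THE GROUP SIZE CAN BACK OUT THE AVERAGE OF EVERYONE ELSE.

```python
MIN_CELL_SIZE = 5  # the "rule of five" described in the talk


def publishable_mean(values, min_cell_size=MIN_CELL_SIZE):
    """Return the group average, or None when the group is too small to report."""
    if len(values) < min_cell_size:
        return None  # suppressed: too few people to publish safely
    return sum(values) / len(values)


def mean_of_others(published_mean, group_size, own_value):
    """Average of the remaining group members, as seen by one member.

    Uses: total = mean * n, so the others' total is total - own_value.
    """
    return (published_mean * group_size - own_value) / (group_size - 1)


# Three people share a mutation; the sample values are made up for illustration.
small_group = [50, 90, 100]
print(publishable_mean(small_group))  # suppressed under the rule of five

# If the average of 80 were published anyway, the person whose value is 50
# could work out the average of the other two people.
print(mean_of_others(published_mean=80, group_size=3, own_value=50))
```

WITH AN AVERAGE OF 80 OVER THREE PEOPLE THE TOTAL IS 240, SO AFTER SUBTRACTING 50 THE OTHER TWO MUST AVERAGE 95, WHICH IS THE PRETTY SPECIFIC INFORMATION THE SPEAKER REFERS TO.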
THAT'S CERTAINLY THEIR CHOICE BUT IT MAY AFFECT THE IDENTIFIABILITY OF OTHER PEOPLE, AND SO IN A CASE LIKE THIS, THE CHALLENGE TO THE RESEARCHER OR TO THE PEOPLE SETTING UP THESE DATABASES IS THE OBLIGATION TO PROTECT PEOPLE WHO MAY NOT WANT TO SHARE OR BE IDENTIFIED. AND HERE I JUST LOOKED ON THE INTERNET AND FOUND THESE. THEY ARE FROM A COUPLE OF PUBLIC FORUMS ON THE INTERNET WHERE PEOPLE ARE REPORTING THEIR MUTATIONS ALONG WITH DETAILED SYMPTOMS AND EVEN GEOGRAPHIC LOCATIONS. IN THE FIRST EXAMPLE, THIS PERSON'S SUBJECT LINE SHOWS THEIR SPECIFIC MUTATIONS, AND THEN THE POST SAYS, CURRENTLY IN THE HOSPITAL WITH MY THREE MONTH OLD DAUGHTER, DESCRIBES THE ACTUAL SYMPTOMS, YOU KNOW, PHENOTYPES THAT SHE'S EXPRESSING, AND SAYS BASICALLY, I'M JUST TRYING TO UNDERSTAND THE SEVERITY OF THESE GENES AND WOULD LOVE TO HEAR FROM ANYONE ELSE WHO HAS THESE MUTATIONS. WE ARE LOCATED IN WELLINGTON, NEW ZEALAND. THE SECOND ONE WAS SORT OF THE SIGN-OFF OR SIGNATURE FROM SOMEONE WHO WAS ANSWERING SOMEONE ELSE'S QUESTION ON A FORUM, AND IN HIS SIGNATURE HE GIVES HIS AGE, HIS AGE AT DIAGNOSIS, HIS MUTATIONS AND VERY SPECIFIC PHENOTYPE OR DISEASE INFORMATION. SO YOU CAN SEE THAT WHILE THIS IS A PERSONAL CHOICE, IT COULD COMPROMISE THE PRIVACY OF OTHER PEOPLE WHO MAY NOT BE READY OR WILLING TO SHARE THEIR INFORMATION. ANOTHER PRIVACY-RELATED CONCERN THAT SEEMS TO COME UP A LOT IS THE FEAR OF LOSS OF HEALTH INSURANCE. AND EVEN THOUGH WE'RE TALKING ABOUT RESEARCH, OFTEN CLINICAL RESEARCHERS WILL GENERATE A CLINICAL REPORT, ESPECIALLY IF THEY'RE CONFIRMING SOME OF THEIR FINDINGS IN A CLIA-CERTIFIED LAB OR POSSIBLY RETURNING THEM TO PATIENTS OR TO THE PATIENT'S PHYSICIAN. SO THE REALITY IN THIS IS THAT IN THE CASE OF RARE DISEASES, THERE'S ALREADY GOING TO BE A DIAGNOSIS OR AT LEAST A DESCRIPTION OF SYMPTOMS SOMEWHERE IN THE PATIENT'S MEDICAL HISTORY.
BUT PEOPLE TEND TO VIEW GENETIC INFORMATION DIFFERENTLY, AND SEEM TO BE UNIQUELY AFRAID OF IT GETTING INTO THE WRONG HANDS OR THE HANDS OF INSURERS AND BEING USED AGAINST THEM. ONE RARE DISEASE CLINICIAN AND RESEARCHER WE SPOKE WITH SAID THAT THIS CAME UP ALL THE TIME. SO IT'S IMPORTANT TO ADDRESS THESE ISSUES WHEN YOU'RE TALKING TO POTENTIAL PARTICIPANTS IN RESEARCH, AND TO CLEARLY EXPLAIN THE PROTECTIONS THAT ARE IN PLACE AND ALSO THEIR LIMITATIONS. SO, TO EXPLAIN THE PROCESS OF DEIDENTIFICATION AND CODING OF DATA. AND THERE'S ALSO A MOVEMENT TOWARD MORE TRANSPARENCY IN CONSENT ABOUT THESE RISKS: INSTEAD OF PROMISING COMPLETE OR PURE PRIVACY, TO BE CLEAR THAT THERE ARE MINIMAL RISKS. THERE ARE ALSO CERTIFICATES OF CONFIDENTIALITY, WHICH NOT EVERYBODY KNOWS ABOUT, RESEARCHERS THEMSELVES AND CERTAINLY PARTICIPANTS, THAT YOU CAN OBTAIN FROM NIH, AND IT'S AN EXTRA LAYER OF PROTECTION FOR PARTICIPANTS. IT GIVES RESEARCHERS OR STAFF, FOR EXAMPLE, THE ABILITY TO REFUSE TO DISCLOSE ANY IDENTIFYING INFORMATION TO LAW ENFORCEMENT IF THEY WERE ASKED. AND OF COURSE THERE IS THE GENETIC INFORMATION NONDISCRIMINATION ACT, WHICH PROTECTS AGAINST DISCRIMINATION BY HEALTH INSURANCE COMPANIES AND MOST EMPLOYERS BUT DOES NOT APPLY TO LIFE INSURANCE, DISABILITY INSURANCE OR LONG-TERM CARE INSURANCE. SO DR. RUBENSTEIN AND THE OFFICE OF RARE DISEASES RESEARCH ARE VERY MUCH AWARE OF THESE ISSUES, AND ARE WORKING ON POSSIBLE SOLUTIONS. ABOUT A YEAR AND A HALF AGO, THEY HELD A WORKING GROUP DEVOTED TO RECOMMENDING TEMPLATES FOR SHORT, SIMPLE AND CLEAR INFORMED CONSENT FOR PATIENT REGISTRIES, AND RECOMMENDING WHAT ELEMENTS SHOULD BE INCLUDED IN INFORMED CONSENT AND WHAT TYPES OF INFORMATION AND MATERIALS SHOULD BE AVAILABLE TO POTENTIAL RESEARCH PARTICIPANTS. AND THIS IS HOW DR. RUBENSTEIN, HOW THEIR GRDR CONSENT TEMPLATE, HAS HANDLED THIS.
THE RISK IS DESCRIBED IN THE BODY OF THE CONSENT DOCUMENT, BUT AT THE END THERE'S ALSO THIS BOX TO CHECK, ALONG WITH A NUMBER OF OTHER BOXES TO CHECK, AS A WAY OF VERIFYING OR REINFORCING COMPREHENSION OF THESE STATEMENTS HERE: I UNDERSTAND ALL ATTEMPTS WILL BE MADE TO PROTECT MY PRIVACY. MY PERSONAL INFORMATION WILL BE PROTECTED AND SAVED IN THE REGISTRY AND CODED. HOWEVER, THERE IS A VERY SMALL RISK THAT MY PERSONAL INFORMATION COULD BE REVEALED. AND THIS IS EXACTLY THE TYPE OF THING THAT NEEDS TO BE DONE, SORT OF THIS MEETING OF THE MINDS, DEVELOPING TEMPLATES AND KIND OF GUIDELINES. BUT WHAT'S HAPPENING IS THAT IT SEEMS THAT PEOPLE ARE DOING IT IN MANY DIFFERENT PLACES IN ISOLATION AND SORT OF HAVING TO REINVENT THE WHEEL EACH TIME. SOMETHING THAT MAY HELP WITH THIS IS THAT RECENTLY, LAST SUMMER, THERE WAS AN ADVANCE NOTICE OF PROPOSED RULEMAKING IN WHICH THE FEDERAL GOVERNMENT ANNOUNCED THEY WERE CONTEMPLATING CERTAIN CHANGES TO THE COMMON RULE, WHICH IS THE FEDERAL POLICY FOR THE PROTECTION OF HUMAN SUBJECTS. AND AMONG THE PROPOSED REGULATORY REFORMS ARE THESE THREE CHANGES WHOSE NEED IS REALLY UNDERSCORED BY OUR FINDINGS AND OUR RESEARCH AND WHAT I'VE BEEN TALKING TO YOU ABOUT TODAY. THE FIRST IS GREATER SPECIFICITY ON CONSENT FORM CONTENT WITH INCREASED TRANSPARENCY, MEANING FOR ONE THING SHORTER, SIMPLER LANGUAGE. THE OTHERS ARE UNIFORM STANDARDS FOR DATA SECURITY PROTECTIONS, CALIBRATED TO THE LEVEL OF IDENTIFIABILITY, AND A SINGLE IRB OF RECORD FOR ALL U.S. SITES IN A MULTISITE STUDY. THESE PROPOSALS ARE VERY ENCOURAGING BUT IT'S UNCLEAR IF AND WHEN THEY MIGHT ACTUALLY BE MADE. SOME OF YOU MIGHT HAVE MORE INSIGHT ON THAT. SO I WANT TO JUST WRAP UP HERE BY TALKING ABOUT REMAINING CHALLENGES AND WHAT IS NEEDED FOR THE FUTURE. THAT IS STRIKING A BALANCE BETWEEN PROTECTING AGAINST PRIVACY RISK AND ALSO ALLOWING BROAD ENOUGH ACCESS TO DATA AND SAMPLES IN ORDER TO FACILITATE SCIENTIFIC AND MEDICAL ADVANCES.
AND MORE SPECIFIC, CONSISTENT GUIDELINES FOR CONSENT, PRIVACY AND DATA SHARING. THESE CAN INCLUDE THE ADAPTABLE TEMPLATE CONSENT FORMS THAT WE'VE TALKED ABOUT AS WELL AS, AND WE HEARD THIS EARLIER, SOME STANDARDIZED MINIMUM REQUIREMENTS FOR PRIVACY PROTECTIONS AND EVEN IT SYSTEMS, LIKE LIMS OR ENCRYPTION SOFTWARE. IF THERE WERE SOME STANDARDS THAT RESEARCHERS COULD TURN TO FOR THIS, BUT THAT ALSO ALLOWED LOCAL FLEXIBILITY, THAT WOULD BE HELPFUL. ENGAGEMENT WITH RESEARCH PARTICIPANTS TO MEASURE COMPREHENSION, COMFORT AND SATISFACTION ABOUT THE ENROLLMENT AND CONSENT PROCESS AND ABOUT SPECIFIC STUDY PROTOCOLS. AND FINALLY, BETTER ONGOING TRAINING FOR INVESTIGATORS AND IRB REVIEWERS. SOME PEOPLE SAID, I'M NOT SURE WE NEED MORE GUIDELINES, THERE'S A NEED FOR BETTER TRAINING AND UNDERSTANDING AMONG PEOPLE INVOLVED IN THE PROCESS. SO I JUST WANT TO THANK MY COLLEAGUES AT THE GENETICS AND PUBLIC POLICY CENTER AND ALL OF OUR STUDY PARTICIPANTS, AND DR. RUBENSTEIN FOR INVITING ME AND PUTTING THIS TOGETHER. SO IF YOU HAVE ANY QUESTIONS. [APPLAUSE] >> IF THERE ARE NO QUESTIONS, I WANT TO THANK THE SPEAKERS WHO GAVE WONDERFUL AND EXCELLENT PRESENTATIONS HERE. AND THANK YOU ALL FOR COMING. UNTIL NEXT TIME. THANK YOU AND STAY TUNED.