THANKS, EVERYONE, FOR TAKING THAT QUICK LUNCH BREAK. AS I MENTIONED BEFORE LUNCH, THE NEXT TALK IS AN INTRODUCTION TO THE METADATA REPOSITORY, AND DR. JAY GREENFELD IS GOING TO GIVE US THAT PRESENTATION. WE'LL PAUSE AT THE END OF HIS PRESENTATION FOR A DISCUSSION SESSION AND QUESTION-AND-ANSWER, THEN TURN IT OVER TO MR. WAYNE KUBICK, CTO OF CDISC, WHO WILL TALK ABOUT STANDARDS IN METADATA. WITH THAT, JAY, I WILL TURN IT OVER TO YOU. >> THE METADATA REPOSITORY PICKS UP FROM AND INTEGRATES WITH THE MDS AND WORKBENCH THAT WERE PREVIOUSLY DISCUSSED. IT HAS A SLIGHTLY DIFFERENT AUDIENCE. IT'S AIMED MORE AT THE RESEARCH COMMUNITY, NOT SO MUCH AT THE STUDY CENTERS, BUT THERE ARE WORK PRODUCTS IN THE MDR THAT MIGHT BE USEFUL TO THE STUDY CENTERS TOO. SO IT'S A DATABASE AND A WEBSITE. IT TRAFFICS IN DATA ELEMENTS AND FOLLOWS THE DATA ELEMENTS FROM CONCEPTION THROUGH DATA COLLECTION, DATA PROCESSING AND DISSEMINATION. IT IS BUILT ON TOP OF THE DATA DOCUMENTATION INITIATIVE (DDI) STANDARD AS WELL AS ISO 11179. WHEN WE THINK OF METADATA REPOSITORIES WE THINK OF SOMETHING LIKE THIS. THIS IS THE CDE BROWSER FROM THE CADSR. COINCIDENTALLY, WE SEE HERE A TERM THAT WAS INTRODUCED TO THE CADSR BY NICHD, ONE OF THOSE CONTRIBUTIONS THAT DR. HIRSCHFELD TALKED ABOUT EARLIER THAT NICHD IS MAKING. WHEN WE THINK ABOUT A METADATA REPOSITORY WE MIGHT ALSO THINK ABOUT SOMETHING LIKE THIS. THIS COMES FROM THE NCBO BIOPORTAL. DR. PADULLA MENTIONED THIS PARTICULAR TERMINOLOGY BEFORE. IT'S THE NICHD PEDIATRIC TERMINOLOGY, AND IT'S BEEN LOADED ON TO THE BIOPORTAL. THE LONGITUDINAL MDR FOLLOWS DATA ELEMENTS IN A DIFFERENT CONTEXT. IT DOESN'T LOOK AT DATA ELEMENTS IN THE CONTEXT OF FORMS. IT LOOKS AT DATA ELEMENTS IN THE CONTEXT OF A STUDY. IN THE NCS THE STUDY HAS MANY PHASES. IT BEGINS WITH RECRUITMENT, GOES THROUGH PRE-CONCEPTION, PREGNANCY AND BIRTH, AND THEN, AND THESE ARE ARBITRARY DESIGNATIONS, THE FIRST TWO YEARS AND YEARS 2 TO 21.
THEN YOU CAN DRILL DOWN INTO THE COLLECTION EVENTS AND INSTRUMENTS AND GET AT THE DATA ELEMENTS. SO IT DOCUMENTS DATA COLLECTION OVER TIME. THE WAY IT DOES THIS, IT USES SOME OF THOSE STUDY OBJECTS THAT WENDY WAS TALKING ABOUT. THAT SLIDE DOESN'T SHOW UP. SORRY, LET ME JUST LOOK AHEAD TO SEE IF I HAVE ANY MORE ISSUES. THIS SLIDE WAS SUPPOSED TO SHOW ALL THE STUDY OBJECTS AND THE TAGGING THAT WE DO AGAINST THOSE STUDY OBJECTS. IT'S VERY INFORMATIVE RIGHT NOW. I WILL KIND OF MOVE ON TO THE NEXT SLIDE, WHICH PROBABLY DOESN'T HAVE A LOT OF MEANING WITHOUT THEM. THIS GOES BACK TO THE WORK THAT DR. PADULLA AND OTHERS HAVE DONE. YOU CAN SEE THAT WHEN WE'RE TALKING ABOUT THE PROTOCOL AND FOLLOWING A PARTICIPANT, WE'RE ACTUALLY USING THE WORK THAT THAT GROUP DID. WE'RE RELATING COLLECTION EVENTS TO THE CHILD LIFE STAGES. THE MDR FACILITATES DATA MANAGEMENT OVER TIME. WE SAW THIS GRAPHIC BEFORE FROM WENDY. WHAT WE HAVE DESCRIBED SO FAR ARE STUDY CONCEPTS AND DATA COLLECTION. IN ADDITION, THE MDR ALSO LOOKS AT DATA MANAGEMENT OVER TIME. IN THE PROCESS IT DESCRIBES THE COLLECTED DATA AND WHAT HAPPENS TO THAT DATA AFTER IT'S COLLECTED. IT'S ARCHIVED, AND THERE'S A LOT OF PROCESSING THAT THE COLLECTED DATA GOES THROUGH THAT LEADS TO DATA DISTRIBUTION AND DISSEMINATION, DATA SETS, AND THEN EVENTUALLY DATA DISCOVERY AND ANALYSIS. WHAT ARE THESE PROCESSING EVENTS? THERE ARE MANY. THEY INCLUDE EDIT CHECKS DURING DATA COLLECTION, SCHEMA VALIDATION, AND EDIT CHECKS DURING DATA INTEGRATION. AS WE HEARD IN THE EARLIER TALKS, THERE ARE CLEANING OPERATIONS. WE HAVEN'T REALLY GOTTEN TO DOING THAT KIND OF STUFF YET: WEIGHTING, CODING, AND WITHIN CODING THE HANDLING OF NON-RESPONSE QUESTIONS, RECODES, IMPUTATIONS AND DERIVATIONS. WHAT WE WOULD LIKE TO DO, AND THIS IS A FUTURE, BUT WE WILL SEE SOON WHAT WE HAVE ACTUALLY DONE AND WHAT WE MIGHT HOPE TO DO AS WE GO FORWARD, IS BUILD A TIME LINE OF PROCESSING EVENTS.
SO AS NEW VERSIONS OF THE WORKBENCH COME OUT AND CHANGE THE EDITS THAT WE ARE USING, WE KEEP THAT TIME LINE, NOT UNLIKE THE TIME LINE ANDREW WAS DESCRIBING, AND WE MAKE THAT TIME LINE AVAILABLE TO THE RESEARCH COMMUNITY SO THEY KNOW THE PROVENANCE OF EACH DATA ELEMENT IN THEIR DISSEMINATION DATA SET. I'M GOING TO SKIP THIS SLIDE FOR THE SAKE OF BREVITY. LET'S LOOK AT SOME SCREENSHOTS. THIS IS THE WELCOME PAGE. ON ONE SIDE IT HAS SOME VERBIAGE THAT TALKS ABOUT THE MISSION OF THE METADATA REPOSITORY, AND IT ALSO EXPOSES SOME OF THE FUNCTIONALITY RIGHT ON THE WELCOME PAGE. SO THERE'S AN ABILITY TO SEARCH FOR DATA ELEMENTS, AND THERE'S AN ABILITY TO DOWNLOAD A DATA DICTIONARY, WHICH WE'LL LOOK AT A LITTLE MORE IN A MINUTE, AND TO DOWNLOAD A CHANGE LOG. SO THIS IS THE DATA DICTIONARY DOWNLOAD PAGE. FOR EACH VERSION OF THE MDS THERE ARE THREE DATA DICTIONARIES AND A WAY TO BRING THEM TOGETHER AND DOWNLOAD THEM ALL. EACH ONE HAS A DIFFERENT FORMAT. ONE DATA DICTIONARY IS AN HTML DATA DICTIONARY. IT LOCATES DATA ELEMENTS IN TABLES, BUT IT ALSO TAGS THOSE DATA ELEMENTS WITH POPULATIONS AND CONCEPTS IN LINE WITH THE ISO 11179 STANDARD. IF YOU WERE TO CLICK ON BIRTH FACILITY, FOR EXAMPLE, YOU GO OUT TO THE NCI THESAURUS AND YOU CAN FIND OUT WHAT THE DEFINITION OF BIRTH FACILITY IS. THIS IS ONGOING WORK WE'RE DOING WITH THE CURRENT INSTRUMENTS. I THINK WE HAVE COMPLETED THE 3, 6, 9 AND 12 MONTH INSTRUMENTS SO FAR. AS AN ASIDE, NOTICE THAT THE CONTRIBUTING SOURCE FOR THIS CONCEPT IS NICHD. SO THIS IS WORK NICHD IS DOING TO INFORM THE CADSR. THIS IS THE MASTER DATA ELEMENT SPECIFICATION THAT WE TALKED ABOUT EARLIER. THIS ONE IS KIND OF INTERESTING, NOT TO A HUMAN BEING BUT PERHAPS TO A COMPUTER PROGRAM. THIS IS THE DDI XML SPECIFICATION OF THE MDS.
SO FAR, I DON'T KNOW THAT FOLKS LIKE WARREN HAVE HAD A CHANCE TO USE IT TO SEE WHAT IT DOES AND DOESN'T DO, BUT WE'RE ANXIOUS TO PUT IT INTO THE HANDS OF FOLKS WHO DO WORK WITH METADATA IN A COMPUTATIONAL WAY, TO SEE WHAT IT HAS AND HOW IT CAN BE IMPROVED. THIS IS THE BROWSE STUDY PAGE THAT WE SHOWED BEFORE. AND I'D LIKE TO -- OH, NO. THAT'S TERRIBLE. WE SAW THAT SLIDE BEFORE. I DON'T KNOW IF YOU CAN SEE THIS, BUT I WANTED TO DISCUSS A LITTLE BIT ABOUT THE WEBSITE ARCHITECTURE. WHAT WE HAVE IN THE MIDDLE OF THE ARCHITECTURE IS THE MDS. WE TAKE THE MDS WITH ITS OPERATIONAL DATA ELEMENTS AND ITS INSTRUMENT DATA ELEMENTS, AND THE INSTRUCTIONS YOU SAW BEFORE ON HOW TO COMPLETE THE OPERATIONAL DATA ELEMENTS AND THE INSTRUMENT DATA ELEMENTS, AND WE LOAD THAT INTO A DDI COMPLIANT DATA STORE. IN THE PROCESS THE METADATA GETS RESTRUCTURED. IN THE RESTRUCTURING WE NOT ONLY HAVE THE DIFFERENT STUDY OBJECTS, BUT WE CREATE THE RELATIONSHIPS THAT THOSE STUDY OBJECTS HAVE WITH ONE ANOTHER. SO FOR EXAMPLE, THE VARIABLES THAT WE BRING INTO THIS STORE HAVE A REFERENCE BACK TO CONCEPTS. THAT'S A RELATIONSHIP. AS WENDY SAID, YOU DON'T PUT THE CONCEPT WITH THE VARIABLE, YOU HAVE A REFERENCE TO A CONCEPT. VARIABLES CAN ALSO REFERENCE QUESTIONS. SO YOU HAVE AN ID THAT GOES WITH THE VARIABLE THAT REFERENCES THE QUESTION, AND A PARTICULAR VERSION OF THE QUESTION. SO WE GO THROUGH AN ELABORATE RESTRUCTURING PROCESS WHEN WE BRING THE METADATA INTO THE DDI COMPLIANT STORE. IN ADDITION TO THAT, THIS STORE IS NOT VERY WELL SUITED FOR RETRIEVAL. IT'S XML-LIKE. SO WHAT WE DO THEN IS PUT IT INTO YET ANOTHER DATABASE, AND FROM THAT DATABASE WE RETRIEVE THE PAGES FOR THE WEBSITE. THERE ARE A COUPLE OF FUTURES YOU CAN'T SEE. ONE IS TO ADD PROCESSING EVENTS TO THE DATA STORES AND MAKE THEM AVAILABLE WITH THESE TIME LINES TO USERS. ANOTHER FUTURE WE HAVE, I HOPE WE CAN SEE THIS ONE. OKAY.
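AS A MINIMAL SKETCH OF THE RESTRUCTURING DESCRIBED ABOVE, WHERE A VARIABLE CARRIES A REFERENCE TO A CONCEPT AND TO A PARTICULAR VERSION OF A QUESTION RATHER THAN A COPY OF EITHER, THE IDEA MIGHT LOOK SOMETHING LIKE THE FOLLOWING IN PYTHON. EVERY CLASS NAME, FIELD NAME AND IDENTIFIER HERE IS AN ASSUMPTION MADE FOR ILLUSTRATION. NONE OF IT IS THE DDI SCHEMA OR THE MDR'S ACTUAL DATA MODEL.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


# All names and identifiers below are hypothetical illustrations,
# not the DDI schema and not the MDR's real data model.
@dataclass(frozen=True)
class Reference:
    """Points at another study object by identifier and version."""
    object_id: str
    version: str


@dataclass(frozen=True)
class Concept:
    object_id: str
    version: str
    label: str


@dataclass(frozen=True)
class Question:
    object_id: str
    version: str
    text: str


@dataclass(frozen=True)
class Variable:
    name: str
    concept_ref: Reference                    # a reference, not an embedded concept
    question_ref: Optional[Reference] = None  # one specific version of a question


class Store:
    """In-memory stand-in for a store that resolves references."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], object] = {}

    def add(self, obj) -> None:
        self._objects[(obj.object_id, obj.version)] = obj

    def resolve(self, ref: Reference):
        return self._objects[(ref.object_id, ref.version)]


store = Store()
store.add(Concept("c-age", "1.0", "Age"))
store.add(Question("q-age", "1.0", "What is your age?"))
store.add(Question("q-age", "2.0", "What is your age in years?"))

# The variable names version 2.0 of the question explicitly.
age = Variable("AGE", Reference("c-age", "1.0"), Reference("q-age", "2.0"))

print(store.resolve(age.concept_ref).label)
print(store.resolve(age.question_ref).text)
```

BECAUSE THE VARIABLE HOLDS ONLY AN IDENTIFIER AND A VERSION, A QUESTION CAN BE REWORDED IN A LATER VERSION WITHOUT CHANGING WHAT AN EXISTING VARIABLE POINTS TO.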
WE CAN GENERATE AN RDF TRIPLE STORE FROM THE DDI COMPLIANT DATA STORE. THAT'S SOMETHING THE DDI COMMUNITY IS DEVELOPING, AND ONE PERSON WHO IS HERE IS INVOLVED IN DOING THE MAPPING BETWEEN OBJECTS IN THE DDI STORE AND RDF OBJECTS. THE RDF OBJECTS, THE TRIPLES, CONSIST OF A SUBJECT, A PREDICATE AND AN OBJECT. HERE YOU SEE THE VARIABLE AGE, WHICH REFERENCES A QUESTION, "WHAT IS AGE." WHAT CAN WE DO WITH SOMETHING LIKE THAT? WE CAN ACTUALLY QUERY THE TRIPLES. FOR EXAMPLE, WE CAN EXPLORE THE COMPARABILITY OF THE DATA ELEMENTS, WHICH IS WHAT THIS PARTICULAR QUERY IS DOING. SO, SOME OF THE CHALLENGES AHEAD THAT WE FACE. I NEVER KNOW WHETHER THE NEXT PAGE WILL BE BLANK OR NOT. ONE OF THE CHALLENGES THAT WE HAVE TALKED ABOUT A LOT TODAY IS HOW YOU REPRESENT A DATA ELEMENT IN A LONGITUDINAL STUDY. YOU SEE BLOOD PRESSURE AS AN EXAMPLE, AND BLOOD PRESSURE HAS DATA WE'RE FAMILIAR WITH. BUT TAKING THE BLOOD PRESSURE ALSO INVOLVES A PROTOCOL. THERE'S A DEVICE, THERE'S A LOCATION, A CUFF SIZE AND SO FORTH AND SO ON. IN ADDITION TO THE HOW, THERE'S ALSO A STATE OR A WHEN. IT SEEMS LIKE IF YOU ARE TAKING BLOOD PRESSURE OR MEASURING SOMETHING OVER TIME, YOU MAY NEED TO HAVE MORE INFORMATION ABOUT WHAT THAT IS THAN JUST WHAT DATA IT COLLECTS, A DESCRIPTION OF IT, A REFERENCE TO A CONCEPT AND A REFERENCE TERMINOLOGY. SO THAT IS ONE OF THE CHALLENGES THAT I THINK WE CONFRONT AS WE GO FORWARD. ANOTHER ONE IS IN LINE WITH THE DISCUSSION ABOUT DATA TYPES. YOU CAN SEE HERE, AND THIS IS AN EXAMPLE, I DON'T THINK WE'RE DOING ANYTHING LIKE THIS TYPE OF OBSERVATION WITH INFANTS, THE NEONATAL SCALE. ON THE RIGHT YOU SEE THE DATA. IT CONSISTS OF OBSERVATIONS: THE OBSERVATION OF THE CHILD, THE MOISTURE OF THE CHILD, THE SKIN, SO FORTH AND SO ON. IT ALSO INCLUDES A TOTAL SCORE BASED ON THE OBSERVATIONS. IT ALSO INCLUDES SOMETHING CALLED A GRADED RISK, WHICH IS WHERE YOU LOCATE A CHILD GIVEN A CERTAIN TOTAL SCORE. WE CAN IMAGINE THAT EXAMINATIONS AND SCALES ALSO CHANGE OVER TIME.
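THE TRIPLE STRUCTURE DESCRIBED EARLIER IN THIS PASSAGE, A SUBJECT, A PREDICATE AND AN OBJECT, AND THE IDEA OF QUERYING TRIPLES TO EXPLORE COMPARABILITY, CAN BE SKETCHED IN PLAIN PYTHON WITHOUT ANY RDF LIBRARY. THIS IS ONLY AN ILLUSTRATION UNDER STATED ASSUMPTIONS: THE IDENTIFIERS, THE PREDICATE NAMES references AND hasConcept, AND THE RULE THAT TWO VARIABLES ARE CANDIDATES FOR COMPARISON WHEN THEY SHARE A CONCEPT ARE ALL HYPOTHETICAL. THEY ARE NOT TAKEN FROM THE DDI-TO-RDF MAPPING OR FROM THE QUERY SHOWN ON THE SLIDE.

```python
from collections import defaultdict

# Hypothetical triples: (subject, predicate, object).
# Identifiers and predicate names are invented for illustration.
triples = [
    ("var:AGE_V1", "references", "question:WHAT_IS_AGE"),
    ("var:AGE_V1", "hasConcept", "concept:Age"),
    ("var:AGE_V2", "hasConcept", "concept:Age"),
    ("var:BP_SYS", "hasConcept", "concept:SystolicBloodPressure"),
]


def match(pattern, data):
    """Return the triples that fit a (s, p, o) pattern; None acts as a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


def comparable_variables(data):
    """Group variables by shared concept and keep concepts with more than one variable."""
    by_concept = defaultdict(list)
    for subject, _, concept in match((None, "hasConcept", None), data):
        by_concept[concept].append(subject)
    return {c: sorted(v) for c, v in by_concept.items() if len(v) > 1}


print(match(("var:AGE_V1", "references", None), triples))
print(comparable_variables(triples))
```

A REAL TRIPLE STORE WOULD EXPRESS THE SAME PATTERN MATCH IN ITS OWN QUERY LANGUAGE. THE POINT OF THE SKETCH IS ONLY THAT RELATIONSHIPS BECOME DATA THAT CAN BE SEARCHED.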
THIS ONE MIGHT NOT, BUT OTHERS THAT ARE STILL MATURING DO, AND WE WILL NEED TO CAPTURE THIS KIND OF INFORMATION IN ORDER TO BE ABLE TO ROLL UP DATA FROM SCALES THAT CHANGE OVER TIME SO WE CAN REASON ABOUT WHAT'S HAPPENING TO THE CHILD. THEN THE LAST ONE, IN LINE WITH THIS, IS THIS OTHER DATA TYPE WE TALKED ABOUT A LITTLE BIT THIS MORNING TOO: HEALTH DEVELOPMENT TRAJECTORIES. THE IDEA IS WE MAY EVENTUALLY USE THESE IN AN ADAPTIVE PROTOCOL TO TRIGGER PROSPECTIVE STUDIES FOR SUBSETS OF PARTICIPANTS. HOW CAN WE BE SURE, AS WE GROW THESE OBJECTS BASED ON NEW KNOWLEDGE, THAT OUR STUDIES DON'T CONTAIN APPLES AND ORANGES? WE'RE GOING TO MATURE OUR UNDERSTANDING OF THESE OBJECTS AND DATA TYPES OVER TIME. ONE OF THE REASONS WE HAVE THIS CONFERENCE OR WORKSHOP IS TO GET INPUT FROM YOU, WHICH WE CAN ALL TAKE AND FIGURE OUT WHAT WE'RE GOING TO BE DOING NEXT. SO WE'RE LOOKING FORWARD TO AND EXCITED ABOUT YOUR PARTICIPATION IN THIS WORKSHOP. THANK YOU. >> QUESTIONS FROM THE GROUP. >> I HAD A BRIEF COMMENT. I'M ALSO WORKING WITH A GROUP DOING THE RDF TRIPLES. WE DID MEET AT THE BEGINNING OF DECEMBER AND HAVE BEEN HAVING PHONE CALLS SINCE. WE SHOULD BE GETTING A PRELIMINARY SET OUT SOMETIME THIS SPRING, AROUND IDD, SO IN MAY OR JUNE. >> WE HAD A PREVIEW OF YOUR WORK AND WE HAVE BEEN ABLE TO EXAMINE ONE TRANSLATION, BUT I DON'T THINK THAT'S THE OFFICIAL TRANSLATION. >> A GENERAL QUESTION FOR MY OWN EDIFICATION. WITH THE RDF FRAMEWORK, ARE THERE OTHER EXAMPLES YOU USE FOR MODELING THESE TYPES OF DATA? I'M A LITTLE FAMILIAR WITH WHAT THE HL-7 REFERENCE INFORMATION MODEL IS DOING. ARE THERE OTHER SIMILAR EXAMPLES, OR HOW DO PEOPLE GO ABOUT IT? IS THERE AN EXTERNAL REFERENCE TO MODEL THESE THINGS SEPARATELY? >> IF YOU HOLD THAT THOUGHT, I'M GOING TO TALK ABOUT THAT IN THE NEXT SESSION. MAYBE WE CAN TALK ABOUT IT THEN. >> WE'RE IN OUR INFANCY WITH USING THE RDF STORES. WE HAVE A FEW EXAMPLES BUT NOT A LOT. BUT THERE'S A LARGER COMMUNITY HERE, A LOT OF PEOPLE WHO HAVE A LOT MORE KNOWLEDGE.
>> THERE ARE MANY REASONS FOR THIS. WHAT ARE YOUR PARTICULAR MOTIVATIONS TO GO INTO AN RDF VERSION? DEMAND FROM A GROUP? >> NO, I THINK IT'S ACTUALLY TRYING TO SCOPE OUT WHAT'S POSSIBLE. WE MAY PRUNE, AND WILL PRUNE, WHAT WE'LL END UP DOING, GIVEN RESOURCE CONSTRAINTS AND THE NEEDS THAT OTHER FOLKS HAVE. WE HAVE AN AUTOMATED WAY OF BEGINNING TO EXPLORE WHAT WE HAVE IN THE WAY OF DATA ELEMENTS, BECAUSE YOU CAN'T THINK AHEAD AND DO EVERYTHING PROSPECTIVELY. SOMETIMES YOU HAVE TO DO THINGS RETROSPECTIVELY. IN THE LAST PHASE OF THE DATA, WENDY TALKED ABOUT RETROSPECTIVE EVALUATION. SO YOU MIGHT USE RETROSPECTIVE EVALUATION. YOU MIGHT TRY TO REASON OVER SOME OF THE THINGS THAT YOU DISCOVERED WHEN YOU WERE DOING DATA ANALYSIS TO THINK ABOUT WAYS TO CHANGE CONCEPTS OR REDEFINE THEM. >> ONE OF THE REASONS TO CONSIDER USING RDF ISN'T JUST TO SIMPLIFY THE QUERY BY CREATING A SPARQL ENDPOINT. AS I WAS SAYING TO WARREN, IT'S ALSO BEING ABLE TO EMPLOY SOME OF THE GENERAL PURPOSE REASONERS THAT ARE OUT THERE NOW. PEOPLE ARE WORKING ON THESE THINGS, AND TO THE EXTENT WE WANT TO MINE OR EXPLORE DATA USING NON-DIRECTED METHODS, IT ALLOWS US TO REUSE STUFF ALREADY IN THE COMMUNITY. >> THEN THERE IS THE IDEA OF RDF BEING MORE OF A PUBLICATION FORMAT. YOU (INAUDIBLE) LATER ON IN RDF FORMAT, YOU TAKE XML AND MORE (INAUDIBLE) TO RDF? >> I'M NOT SURE THAT -- I MEAN, AS WE STAND NOW, IT LOOKS LIKE YOU CAN HAVE BOTH, AND RDF CAN SHADOW THE XML. SO IT COULD BE A WAY OF PUBLISHING THE XML, LIKE YOU SAY. >> BUT IT'S A TRANSFORMATION ENDPOINT, NOT A SECOND STORE. >> HERE IS HOW DDI IS LOOKING AT IT. OUR INTENT HAS BEEN TO DESIGN AND PUBLISH A UML MODEL, SO YOU HAVE A UML MODEL WITH AN XML EXPRESSION AS PROBABLY THE PRIMARY EXPRESSION MODE. ONE REASON WE'RE DOING THE TRANSLATION IS SIMPLY AS ANOTHER IMPLEMENTATION, ANOTHER EXPRESSION OF THE CONTENT MODEL. SO IT'S NOT A NEW MODEL, SIMPLY A NEW EXPRESSION OF IT, PRIMARILY, FROM OUR PERSPECTIVE, TO FACILITATE INTERACTION WITH THE WEB. THE WEB GUYS KNOW RDF, AND IF YOU LEAVE IT TO THEM THEY'LL MAKE UP THEIR OWN.
WE'D LIKE THEM TO USE SOMETHING THAT ALREADY EXISTS AND IS COORDINATED WITH STANDARDS. >> WHAT RDF DOES WELL IS EXPRESS RELATIONSHIPS AND EQUIVALENCY. WE HAVE SO MANY CASES WHERE WE HAVE THE SAME TERMINOLOGY AND CONCEPTS THAT WE HAVE DIFFERENT NAMES FOR. RDF IS GOOD FOR EXPRESSING METADATA OR RELATIONSHIPS BETWEEN METADATA. >> WE CAN GO DOWN INTO THE WEEDS TO TALK ABOUT ALL THE COMPLICATED RELATIONSHIPS THAT YOU END UP NEEDING TO EXPRESS IN A LONGITUDINAL STUDY. SO THOSE WHO STARE AT XML MIGHT LIKE SOME OTHER WAY OF THINKING ABOUT AND LOOKING AT WHAT'S GOING ON. >> I WOULD AGREE A LOT WITH THAT. LATELY WE LOOKED INTO CONCEPT MANAGEMENT AND ALSO CLASSIFICATION MANAGEMENT, WHICH I SEE AS FUNDAMENTAL TO ANY DATA MANAGEMENT SYSTEM. AND CLEARLY, WHEN YOU LOOK AT THAT, LEANING TOWARDS RDF, IT'S MUCH MORE FUNCTIONAL AND FLEXIBLE WHEN YOU (INAUDIBLE), SO THERE IS A LOT OF ROOM TO LINK TO CONCEPTS AND CLASSIFICATIONS. SO IT WILL BE INTERESTING TO KNOW, AND THAT'S WHY I WAS ASKING, HOW YOU'RE PLANNING TO USE IT. I THINK AT THIS POINT YOU ARE STILL EXPLORATORY. IT IS VERY INTERESTING AND THERE ARE THINGS TO DO THERE. >> THERE'S ALSO THE SEMANTIC WEB, WHICH IS ONTOLOGY. >> ANY OTHER COMMENTS OR QUESTIONS FROM THE GROUP? JUST ONE LAST QUESTION, I THINK, BUT ALSO SOMEWHAT OF A SEGUE INTO WAYNE'S TALK. AS WE THINK ABOUT THE NCS METADATA REPOSITORY, WHAT STEPS OR PRACTICES DO WE WANT TO KEEP IN MIND IN LOOKING AT THE REUSABILITY OF THE METADATA REPOSITORY FOR OTHER STUDIES AS WELL? WE'RE FOCUSING RIGHT NOW ON THE NCS, BUT I THINK IN OUR DISCUSSION AROUND STANDARDS, WHETHER USING ISO 11179, DDI OR A UML MODEL, THERE ARE THESE STANDARDS OUT THERE THAT WE WANT TO MAKE SURE WE EMPLOY FOR THE METADATA REPOSITORY TO ENSURE REUSABILITY. SO WITH THAT, I HAVE ALREADY INTRODUCED WAYNE AND WILL TURN IT OVER TO HIM AS SOON AS HIS SLIDES AND MICROPHONE ARE WIRED UP. THANKS. >> I CAN'T WAIT TO SEE MY SLIDES AFTER JAY'S TALK. I MIGHT HAVE TO AD LIB. FIRST I WOULD LIKE TO THANK DR. HIRSCHFELD FOR INVITING ME TO THIS WORKSHOP.
I MUST ADMIT I DIDN'T KNOW MUCH ABOUT THE NATIONAL CHILDREN'S STUDY PRIOR TO THAT CONTACT, SO I'M COMING IN AT A DISADVANTAGE TO ALL OF YOU. I CAN SAY THAT I STARTED READING THE DOCUMENTS WE WERE SENT LAST WEEK. WHAT A FASCINATING PROJECT THIS IS. I CAN'T WAIT TO SEE HOW IT TURNS OUT. I COME FROM THE WORLD OF RANDOMIZED CONTROLLED CLINICAL TRIALS, PRINCIPALLY RUN BY COMMERCIAL SPONSORS TO MARKET NEW DRUGS AND DEVICES IN THE WORLD OF HEALTHCARE. THESE ARE CONSTRAINED, SHORT DURATION THINGS THAT RUN FROM WEEKS IN SOME CASES TO MAYBE ONE OR MORE YEARS IN THE PHASE 3 STUDIES. SOMETHING LIKE THIS THAT STRETCHES OVER DECADES REALLY MAKES YOU LOOK AT THINGS TOTALLY DIFFERENTLY, BECAUSE EVERYTHING CHANGES OVER THE COURSE OF DECADES. I'M REALLY INTERESTED TO SEE HOW THIS WORKS. I'M NOT GOING TO TALK ABOUT ANYTHING DIRECTLY RELATED TO THE PROJECT. I'M GOING WITH DR. HIRSCHFELD'S SUGGESTION TO TALK ABOUT A FEW CONCEPTS AND THINGS THAT WE HAVE DONE IN MY ORGANIZATION OR THAT WE HAVE BEEN EXPOSED TO. SOME OF THOSE ARE ACTUALLY GOING TO BE REDUNDANT, AND I'LL TRY TO SKIP OVER THOSE THAT I CAN SEE, BUT SOME MAY HAVE AN INFLUENCE ON WHAT YOU DO IN THE FUTURE. TO RAISE SOME NEW IDEAS, ONE OF THE PREVIOUS SPEAKERS MENTIONED THE FACT THAT YOU HADN'T LOOKED AT OTHER STANDARDS. WE ARE A STANDARDS ORGANIZATION, THE CLINICAL DATA INTERCHANGE STANDARDS CONSORTIUM. WE HAVE BEEN IN BUSINESS A DOZEN YEARS OR SO. WHAT WE DO IS DEFINE WHAT ARE ESSENTIALLY METADATA STANDARDS, AS WELL AS A TRANSPORT STANDARD, USED FOR THE CONDUCT AND THE ANALYSIS OF CLINICAL RESEARCH STUDY INFORMATION. WE'RE A GLOBAL, NON-PROFIT ORGANIZATION WITH A SMALL GROUP OF PEOPLE, A DOZEN OR SO, BUT THE WORK IS DONE BY HUNDREDS IF NOT THOUSANDS, AT THIS STAGE, OF VOLUNTEERS WORKING THROUGHOUT THE INDUSTRY. KEEP IN MIND THAT WE HAVE MORE OF A COMMERCIAL PERSPECTIVE THAN AN ACADEMIC RESEARCH PERSPECTIVE. WE'RE DEALING PRINCIPALLY WITH PEOPLE DEVELOPING PRODUCTS TO GET THEM APPROVED THROUGH THE FDA.
OUR MISSION IS BROADER THAN THAT, WHICH IS TO IMPROVE PATIENT CARE AND SAFETY BY IMPROVING THE QUALITY OF MEDICAL RESEARCH. WE BELIEVE THAT USING EFFECTIVE STANDARDS IMPROVES QUALITY, BECAUSE PEOPLE ARE SPEAKING THE SAME LANGUAGE AND GAIN EFFICIENCIES, WHICH LEADS TO QUALITY. SO THAT'S WHAT OUR BACKGROUND IS. YOU HAVE SEEN METADATA BEFORE. THE THREE THINGS I WANTED TO MENTION WHEN I LOOK AT METADATA ARE THIS DIVISION BETWEEN DESCRIPTIVE METADATA, THINGS LIKE DEFINITIONS, VERSUS STRUCTURAL METADATA, WHICH IS THE DATA DICTIONARY STUFF THAT TELLS YOU DATA TYPES AND THINGS LIKE THAT, AND THEN THE ADMINISTRATIVE STUFF, WHICH SOUNDS LIKE YOUR OPERATIONAL ELEMENTS IN THAT CONTEXT. THIS IS AN OLD SLIDE DR. HIRSCHFELD WILL REMEMBER, FROM DAVE CHRISTIANSON, WHO WROTE OUR FIRST DOCUMENT, WHICH WAS ABOUT METADATA, 12 YEARS AGO. WHAT YOU'RE SEEING HERE ARE TWO SETS OF DATA THAT LOOK CONSISTENT. THEY HAVE THE SAME HEADINGS AND DIFFERENT DATE FORMATS, BUT WE HAVE NUMERIC DATA: D IS DISTANCE, S MIGHT BE SPEED AND F IS FORCE. HOW IMPORTANT IS METADATA? (INAUDIBLE) TWO SETS OF RESEARCHERS GOT THEIR UNITS WRONG. THIS IS THE SIMPLEST STAGE OF WHAT METADATA IS. IT'S AN OLD SLIDE, BUT I LIKE TO BRING IT UP AGAIN TO STRESS HOW IMPORTANT IT IS TO GET THIS STUFF RIGHT TO MEANINGFULLY UNDERSTAND WHAT YOU'RE DOING WITH DATA. WE USE METADATA IN DIFFERENT WAYS. WE HAVE A SERIES OF PRODUCTS THAT ARE TARGETED TO DIFFERENT STAGES OF THE RESEARCH PROCESS. DR. HIRSCHFELD WAS INVOLVED WITH A GROUP CALLED ADAM, FOR ANALYSIS DATA MODEL. THESE ARE PEOPLE WHO LOOK AT HOW WE CONDUCT STANDARD ANALYSES ON CLINICAL TRIAL RESULTS, WHICH IS SOMETHING THAT APPLIES TO YOU AS WELL. THE ADAM GROUP STARTED WITH A FOCUS ON METADATA BUT HAD MULTIPLE TYPES. WITHIN THE WORLD OF ANALYSIS YOU NEED TO DESCRIBE THE CONTENTS OF EACH UNIQUE DATA STORE OR SET OR TABLE YOU'RE USING FOR THE ANALYSIS: WHERE IT CAME FROM, WHERE YOU GOT IT, WHERE YOU CAN GET IT, WHAT IT'S USED FOR.
YOU HAVE THE ACTUAL VARIABLE DATA DICTIONARY STUFF, WITH LABELS, TYPES AND FIELD NAMES, AS WELL AS SOMETHING WE CARRY WHICH IS A FIELD CALLED ORIGIN, PART OF THE (INAUDIBLE) TYPE THING. WHERE DID THE DATA COME FROM? YOU WANT TO TRACE THE CHAIN OF CUSTODY: WHERE IT CAME FROM, THE DATA AT THE POINT OF ANALYSIS, AND ALONG THE WAY. THEN WE HAVE ANALYSIS RESULTS METADATA THAT DESCRIBES HOW YOU CONDUCTED THE ANALYSIS. THE INTENTION WAS THAT THIS INFORMATION BE GIVEN TO FDA REVIEWERS SO THEY KNEW WHAT YOU WERE DOING. INSTEAD OF PILES OF DATA SETS AND PROGRAMS, THEY NEED THIS METADATA TO EXPLAIN IT. SO THE ANALYSIS METADATA TELLS YOU THE OBJECTIVES YOU'RE TRYING TO ACHIEVE, WHAT PROGRAMS WERE INVOLVED AND WHAT FORMATS, AS WELL AS MAYBE CHARACTERIZING WHETHER THE OBJECTIVES WERE MET, WHETHER IT DID WHAT IT WAS INTENDED TO DO. SO THERE IS ALL THIS CONTEXTUAL INFORMATION, WHICH IS SO IMPORTANT WITH METADATA. ACROSS ALL THESE TYPES OF METADATA WE TALK ABOUT ALL THE INDIVIDUAL OBJECTS INVOLVED IN THE STATISTICAL ANALYSIS, INCLUDING THE METHODS EMPLOYED AND THE TRANSFORMATIONS THAT ARE EMPLOYED AS YOU CHANGE DATA FROM A RAW FORMAT TO SOMETHING MORE SUITABLE FOR ANALYSIS. ASSUMPTIONS ARE VERY IMPORTANT IN OUR WORLD, THE STATISTICAL ASSUMPTIONS THAT ARE MADE, WHICH MIGHT BE WHAT HAPPENS WHEN YOU MISSED A VISIT. YOU MIGHT CARRY THE LAST OBSERVATION FORWARD, OR YOU MIGHT AVERAGE THE LAST AND NEXT ONE, OR DO SOMETHING LIKE THAT. THEN THERE ARE ALL THE DERIVATIONS AND IMPUTATIONS PERFORMED. WHAT DO YOU DO WITH A MISSING DATE? YOU MIGHT IMPUTE A DATE, AND IF SO, HOW AND WITH WHAT RATIONALE? ALL THESE ARE ELEMENTS WE TRY TO REPRESENT. WE DON'T ACTUALLY TELL YOU WHAT IT IS, WE JUST GIVE YOU THE TOOLS TO REPRESENT IT IN A COMMON WAY. THIS MIGHT BE USEFUL AS YOU GET INTO THE ANALYTICAL PHASES OF YOUR PROJECT. (INDISCERNIBLE) WAS MENTIONED PREVIOUSLY. THAT'S ANOTHER THING WE FIND INTERESTING IN THE WORLD OF CLINICAL TRIALS, BECAUSE THE WAY YOU COLLECT IS NOT THE WAY YOU ANALYZE. WE HAVE A SPECIAL MODEL SPECIFICALLY TUNED TO HOW FDA WANTS TO LOOK AT IT.
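THE MISSED-VISIT ASSUMPTION MENTIONED ABOVE, CARRYING THE LAST OBSERVATION FORWARD, TOGETHER WITH AN ORIGIN FIELD THAT RECORDS WHERE EACH VALUE CAME FROM, CAN BE SKETCHED AS FOLLOWS. THIS IS A MINIMAL ILLUSTRATION AND NOT A CDISC SPECIFICATION: THE FUNCTION NAME locf, THE ORIGIN LABELS COLLECTED, DERIVED:LOCF AND MISSING, AND THE SAMPLE VALUES ARE ALL ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
from typing import List, Optional, Tuple


def locf(values: List[Optional[float]]) -> Tuple[List[Optional[float]], List[str]]:
    """Carry the last observed value forward and record an origin flag per visit.

    The origin labels are hypothetical; they only illustrate keeping
    derivation metadata next to the derived data.
    """
    filled: List[Optional[float]] = []
    origin: List[str] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
            filled.append(value)
            origin.append("COLLECTED")
        elif last is not None:
            filled.append(last)
            origin.append("DERIVED:LOCF")
        else:
            # Nothing observed yet, so there is nothing to carry forward.
            filled.append(None)
            origin.append("MISSING")
    return filled, origin


# Hypothetical systolic readings across five visits; None marks a missed visit.
visits = [120.0, None, 118.0, None, None]
filled, origin = locf(visits)
for value, source in zip(filled, origin):
    print(value, source)
```

KEEPING THE ORIGIN FLAG ALONGSIDE EACH VALUE LETS AN ANALYST OR REVIEWER SEPARATE WHAT WAS COLLECTED FROM WHAT WAS IMPUTED, WHICH IS THE KIND OF TRACEABILITY THE TALK DESCRIBES.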
WE NEED TO TRACK THESE INTERPRETATIONS, AND ANALYSIS RESULTS METADATA WAS ONE EXAMPLE OF HOW THAT'S DONE, BUT IT'S DONE FOR THE RAW DATA AS WELL. YOU SHOULD BE ABLE TO TRACE BACK TO THE POINT OF ORIGIN AND RECOGNIZE HOW THE DATA GOT TO WHERE IT IS. WE HAVE A TRANSPORT STANDARD CALLED ODM, THE OPERATIONAL DATA MODEL, WHICH SOUNDS LIKE SOMETHING RELEVANT TO THIS PROJECT. WHAT I'M SHOWING HERE IS ONE OF THE FIRST USE CASES FOR ODM. THERE WERE SEVERAL. IN THIS PARTICULAR ONE, THE SPONSOR DEFINES THE CLINICAL DATA COLLECTION PORTION OF THE PROTOCOL IN AN XML FORMAT AND SENDS THAT OUT TO A PARTICULAR SITE, OR USUALLY TO AN INVESTIGATOR OR CRO, WHO APPLIES THIS ORGANIZATION IN THEIR OWN SYSTEM AND SENDS THE DATA BACK SO YOU CAN MAKE SURE IT CONFORMS, DOING THE SAME THINGS YOU ARE DOING. SO THERE ARE THINGS ABOUT THE ODM XML TRANSPORT THAT MIGHT BE USEFUL FOR YOU TO LOOK AT IN THE FUTURE, SINCE YOU'RE NOT REALLY TIED TO A SPECIFIC FORMAT. THIS HAS BEEN OUT FOR ABOUT 10 OR 11 YEARS NOW. IT IS AT VERSION 1.3, AND IT IS AN XML SCHEMA. WHAT IT EXPRESSES IS THE METADATA ASSOCIATED WITH THE CONDUCT OF THE TRIAL, THE DATA COLLECTED, AND THE AUDIT TRAIL THAT SHOWS HOW THE DATA CHANGED OVER TIME, AND IT HAS TOOLS TO REPRESENT TRACEABILITY AND TRANSFORMATION ALONG THE WAY. IT'S A HIERARCHICAL SCHEMA COMPOSED OF THE ELEMENTS HERE. THE TOP LEVEL IS THE STUDY. VANGUARD IN YOUR CASE IS AN EXAMPLE. WITHIN EACH STUDY IT'S BASICALLY DESCRIBED AS A SERIES OF METADATA VERSIONS TO DESCRIBE EACH PUBLISHED VERSION ACTUALLY USED, AND I'LL EXPLAIN THAT MORE. THEN THERE ARE THE PEOPLE INVOLVED IN THE STUDY, THE INVESTIGATORS AND STUDY COORDINATORS; REFERENCE DATA, WHICH ARE CODE LISTS TO LOOK UP DATA; THE ACTUAL CLINICAL DATA ITSELF; AND ASSOCIATIONS, WHICH HELP YOU TIE THINGS TOGETHER. AT THE ODM LEVEL WE HAVE THINGS THAT HELP KEEP TRACK OF WHAT YOU'RE TRANSMITTING DURING THE COURSE OF THE TRIAL.
WE HAVE GRANULARITY, WAYS TO REPRESENT AN ODM FILE AS, FOR EXAMPLE, A SNAPSHOT OR A TOTAL ARCHIVAL COPY WITH EVERYTHING YOU NEED. SO YOU CAN TAILOR HOW YOU REPRESENT ODM, BUT YOU TIE IT TO VERSIONS, NAMESPACES AND OTHER USEFUL THINGS TO HELP YOU. WE HAVE OUR ODM VERSION AND WE ALSO TIE IT TO DIFFERENT METADATA VERSIONS. SO IF YOU LOOK AT EACH INDIVIDUAL METADATA INSTANCE WITHIN A FILE, YOU SET INDIVIDUAL VERSIONS TO REPRESENT THE VARIOUS POINTS IN TIME THAT JAY TALKED ABOUT IN THE COURSE OF THE STUDY. IT BREAKS DOWN METADATA INTO THE OVERALL STUDY. EVENTS ARE TREATED AS A SORT OF DATA COLLECTION EVENT. THEY ARE VISITS IN OUR WORLD. THEY CAN BE POINTS IN TIME IN YOUR WORLD, REPEATED THINGS THAT HAPPEN AS YOU NEED, BUT BASICALLY A COLLECTION OF DATA ENTRY FORMS FILLED OUT AT ONE POINT IN TIME. FORMS FIT WITHIN THESE EVENTS. ITEM GROUPS ARE SORT OF PANELS OF INFORMATION THAT FIT IN A FORM, AND THEN THERE ARE THE INDIVIDUAL ITEMS. SO THESE ARE THE MAIN STRUCTURAL FEATURES OF THE ODM METADATA. LOOKING DOWN A LITTLE DEEPER AT THE ODM ITEM DEFINITION, ONE THING ABOUT ODM IS THAT, AS IT IS XML, IT DEFINES WAYS TO EXTEND THESE CONCEPTS WITH PARTICULAR THINGS YOU MAY NEED FOR YOUR OWN INDIVIDUAL STUDY. THERE'S A METHOD FOR DOING THAT. AT THE ITEM LEVEL WE COVER THE SAME STUFF YOU TALKED ABOUT EARLIER, DATA TYPES AND LENGTHS. WE ALSO HAVE A WAY TO REPRESENT THINGS LIKE RANGE CHECKS AND EDIT CHECKS, SIMPLE CODED EDIT CHECKS, AND THE QUESTION. IT'S AN EXTERNAL QUESTION. THERE MIGHT BE DIFFERENT WAYS OF REPRESENTING A QUESTION DEPENDING ON THE MODALITY OR ON THE AUDIENCE. IN SOME CASES YOU MIGHT ASK THE QUESTION DIFFERENTLY. THE CLINICAL DATA FOLLOWS THE SAME HIERARCHICAL STRUCTURE, WHERE YOU HAVE THE EVENTS, THE FORMS AND THE INDIVIDUAL DATA ITEMS. WHAT'S IMPORTANT AT THIS LEVEL IS THAT AT EVERY LAYER WE HAVE AN AUDIT RECORD THAT CAN BE USED TO REPRESENT CHANGES THAT HAPPENED TO THE DATA, AS WELL AS ANNOTATIONS. THE ARCHIVE LAYOUT COULD BE SOMETHING TO TIE BACK TO THESE VARIOUS INSTANCES, THE VERSIONS OF THE INSTRUMENTS THAT YOU HAD, FOR EXAMPLE.
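THE HIERARCHY DESCRIBED ABOVE, EVENTS CONTAINING FORMS, FORMS CONTAINING ITEM GROUPS, AND ITEM GROUPS CONTAINING ITEMS, CAN BE WALKED WITH A FEW LINES OF PYTHON. THE XML BELOW IS A SIMPLIFIED FRAGMENT WRITTEN FOR THIS SKETCH. IT BORROWS ODM-STYLE ELEMENT NAMES, BUT IT OMITS NAMESPACES AND MANY REQUIRED PARTS, HAS NOT BEEN VALIDATED AGAINST THE ODM SCHEMA, AND USES INVENTED IDENTIFIERS AND NAMES THROUGHOUT. THE FUNCTION NAME item_paths IS ALSO HYPOTHETICAL.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative fragment. Not schema-valid ODM; all OIDs and
# names are invented for this example.
ODM_FRAGMENT = """
<MetaDataVersion OID="MDV.1" Name="Version 1">
  <StudyEventDef OID="SE.BIRTH" Name="Birth visit" Repeating="No" Type="Scheduled">
    <FormRef FormOID="F.VITALS" Mandatory="Yes"/>
  </StudyEventDef>
  <FormDef OID="F.VITALS" Name="Vital signs" Repeating="No">
    <ItemGroupRef ItemGroupOID="IG.BP" Mandatory="Yes"/>
  </FormDef>
  <ItemGroupDef OID="IG.BP" Name="Blood pressure" Repeating="Yes">
    <ItemRef ItemOID="I.SYSBP" Mandatory="Yes"/>
    <ItemRef ItemOID="I.DIABP" Mandatory="Yes"/>
  </ItemGroupDef>
  <ItemDef OID="I.SYSBP" Name="Systolic blood pressure" DataType="integer"/>
  <ItemDef OID="I.DIABP" Name="Diastolic blood pressure" DataType="integer"/>
</MetaDataVersion>
"""


def item_paths(xml_text):
    """Resolve the references and return one 'event / form / group / item' path per item."""
    root = ET.fromstring(xml_text)
    # Index every definition by its OID so references can be looked up.
    by_oid = {el.get("OID"): el for el in root if el.get("OID")}
    paths = []
    for event in root.findall("StudyEventDef"):
        for form_ref in event.findall("FormRef"):
            form = by_oid[form_ref.get("FormOID")]
            for group_ref in form.findall("ItemGroupRef"):
                group = by_oid[group_ref.get("ItemGroupOID")]
                for item_ref in group.findall("ItemRef"):
                    item = by_oid[item_ref.get("ItemOID")]
                    names = [e.get("Name") for e in (event, form, group, item)]
                    paths.append(" / ".join(names))
    return paths


for path in item_paths(ODM_FRAGMENT):
    print(path)
```

IN THIS SKETCH THE DEFINITIONS ARE HELD FLAT AND CONNECTED BY REFERENCE, SO THE SAME FORM OR ITEM GROUP COULD BE REUSED BY MORE THAN ONE EVENT WITHOUT BEING REDEFINED.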
SO IT HAS LOTS OF FEATURES TO KEEP TRACK OF THINGS. I DON'T KNOW OF ANYONE THAT HAS USED ODM FOR A LONGITUDINAL STUDY OF THIS LENGTH, BUT THIS COULD BE A GOOD EXAMPLE OF A PLACE WHERE IT WOULD BE FUN TO TRY IT. SO THAT'S WHERE WE GO WITH THE ODM XML REPRESENTATION. >> IF WE WENT BACK TO THE METADATA LEVEL, I DIDN'T SHOW YOU EVERYTHING, BUT THERE ARE REPRESENTATIONS WITHIN THE PROTOCOL FOR OTHER INTERESTING FEATURES, SUCH AS HIGH LEVEL DESCRIPTIONS OF EACH INDIVIDUAL STUDY. IN OUR WORLD THIS WOULD BE THINGS LIKE THE PHASE AND THE NUMBER OF SITES, BUT THAT MIGHT APPLY TO YOUR SUBSTUDIES AS WELL, AS WELL AS REPRESENTING THE TABLE OF TIME AND EVENTS AND THE ACTUAL TREATMENT PLANS, WHICH ARE PROBABLY NOT AS RELEVANT TO YOU, BUT THOSE THINGS ARE ACTUALLY WITHIN THE MODEL. SO IT'S A POWERFUL MODEL, AND IT HAS A GOOD TRACK RECORD OF USE, WITHIN THE PHARMACEUTICAL INDUSTRY AT LEAST. METADATA IS VERY IMPORTANT TO THE WORLD OF CDISC. WE DEFINE METADATA STANDARDS THAT PEOPLE APPLY ON CLINICAL STUDIES, AND WE RECOGNIZE THIS HAS GOT TO BE A COMPLICATED TASK. MOST METADATA IS PUBLISHED AS PDF DOCUMENTS OR SPREADSHEETS, WHICH IS NOT THE EASIEST WAY TO WORK. SO WE RECOGNIZED A NEED TO ESTABLISH A REPOSITORY, AND WE BEGAN A PROJECT CALLED SHARE, WHICH IS MEANT TO CAPTURE ALL THE ESSENTIAL CONCEPTS, RELATIONSHIPS AND TERMINOLOGY THAT WE USE IN CLINICAL RESEARCH. MUCH OF THE INFORMATION WE USE IN THIS WORLD IS USED IN THE HEALTHCARE WORLD AS WELL. THAT IS A FACTOR WE'RE WRESTLING WITH. THE PROJECT GETS BIGGER, BECAUSE AT THE CORE WE HAVE THE SAME PROBLEM.
THE GOAL OF THE SHARE PROJECT IS TO ESTABLISH A METADATA REPOSITORY THAT WILL BE USED AS A GOLD STANDARD OF REFERENCE FOR EVERYONE CONDUCTING CLINICAL RESEARCH, FROM A RANDOMIZED CONTROLLED TRIAL, COMMERCIAL POINT OF VIEW BUT HOPEFULLY BEYOND THAT. THIS DOESN'T EXIST YET. IT EXISTS AS A REPRESENTATION OF CONTENT BEING DEVELOPED, AS WELL AS A SET OF REQUIREMENTS. AS A NON-PROFIT WE DON'T HAVE AN IMPLEMENTATION MODEL. WE WERE AT ONE POINT PARTNERING WITH NCI EVS, BUT IT HAS LIMITATIONS, AND NCI ISN'T THE MOST STABLE AS TO WHETHER IT WILL EXIST. ONE THING THAT WE'RE INTERESTED IN IS WHETHER THERE ARE OPPORTUNITIES FOR US TO PARTNER WITH PEOPLE WITH SIMILAR NEEDS IN ORDER TO BUILD A BROADER SCOPE METADATA REPOSITORY TO SERVE THE BROADER WORLD OF HEALTHCARE. WHAT'S IN THE REQUIREMENTS FOR OUR REPOSITORY? FIRST, THE BASIC METADATA WE HAVE. I DIDN'T MENTION THE TWO MODELS THAT ARE LISTED HERE. SDTM IS A WAY OF REPRESENTING CLINICAL OBSERVATIONS ORGANIZED BY DOMAIN OR TOPIC. THAT'S SOMETHING WE'VE DEVELOPED. FDA LIKES DOMAIN LISTINGS, WITH ALL THE LAB DATA IN ONE PLACE, AND THIS IS THE MODEL FOR REPRESENTING THAT. CDASH IS A MODEL FOR STANDARDIZED CASE REPORT FORM CONTENT, THE INSTRUMENTS THAT YOU WOULD USE. SO THAT'S ONE BASIC THING WE NEED TO PUT IN THE REPOSITORY. WE ALSO NEED TO DO TERMINOLOGY BINDING, TO SHOW, WHEN YOU HAVE PARTICULAR DATA ELEMENTS REPRESENTED ON A DATA ENTRY FORM, WHICH VOCABULARIES THEY LINK TO. WITHIN OUR WORLD WE USE THE MEDDRA DICTIONARY PRIMARILY AND A NUMBER OF OTHERS, BUT SNOMED IS OUT THERE, AND I'LL MENTION SNOMED LATER. THEN THERE ARE IMPLEMENTATION INSTRUCTIONS, WHICH BASICALLY TELL YOU HOW TO USE THIS AND TELL SITES HOW TO USE THIS TYPE OF INFORMATION. THERE ARE DIFFERENT WAYS OF REPRESENTING METADATA. WE'LL SEE SLIDES THAT WERE SHOWN EARLIER. THIS IS OUR SDTM WAY TO DISPLAY BLOOD PRESSURE, SPLITTING THINGS INTO 8 CHARACTER NAMED COLUMNS IN A SAS DATA SET. THIS IS A REPRESENTATION OF BLOOD PRESSURE BY INTERMOUNTAIN HEALTHCARE.
THEY DEVELOPED A METADATA REPRESENTATION CALLED CEM, THE CLINICAL ELEMENT MODEL. THEY HAVE THE SAME INFORMATION SPLIT IN SIMILAR WAYS HERE. I THINK LATER WE HAVE THE OPENEHR ONE THAT JAY WAS SHOWING EARLIER, WHICH IS YET ANOTHER REPRESENTATION. INTERESTING THINGS. EITHER SYSTOLIC OR DIASTOLIC, OR ONE WAY OR ANOTHER. WE MENTIONED THAT EARLIER. WE DON'T NEED TO COVER THAT. WITHIN THE WORLD OF SHARE WE HAD ADDITIONAL THINGS TO REPRESENT. ONE ELEMENT IS A REFERENCE MODEL WE CALL BRIDG, A DOMAIN ANALYSIS MODEL. MODELS ARE DIFFERENT FROM METADATA. DOMAIN ANALYSIS MODELS REPRESENT THE WORLD WE OPERATE WITHIN, AND BRIDG IS THE ONE WE HAVE DEVELOPED, WHICH IS MAPPED OR HARMONIZED WITH THE HL-7 REFERENCE INFORMATION MODEL BUT IS BASICALLY INTENDED TO MAKE SENSE TO THE WORLD OF CLINICAL RESEARCHERS, WHICH THE RIM BY NO MEANS CAN ACHIEVE. WE HAVE USED ISO 21090 DATA TYPES. THESE ARE COMPLEX DATA TYPES FOR REPRESENTING INFORMATION, NOT JUST NUMERIC AND CHARACTER, THAT DEAL WITH MORE COMPLEX PACKETS OF INFORMATION, SUCH AS THE PHYSICAL ADDRESS OF ANY OBJECT. THE PHYSICAL QUANTITY DATA TYPE CONSISTS OF A NUMERIC VALUE AS WELL AS THE ASSOCIATED UNITS, THE REFERENCE UPPER AND NORMAL RANGE, AND OTHER FACTORS TO HELP YOU DO THAT. THERE ARE CODED ELEMENT DATA TYPES, WHERE WITHIN A SINGLE DATA CONCEPT YOU WOULD HAVE THE VERBATIM TERM AS WELL AS THE CODED TERM, THE CODING SYSTEM AND ITS VERSION, AND THE ACTUAL CODE. SO IT IS BASICALLY A WHOLE PACKET OF INFORMATION. THIS STUFF NEEDS TO BE EXPANDED OUT IN THE DATABASE TO QUERY IT, BUT IT'S AN EFFICIENT WAY OF DOING MODELING AND CONSISTENTLY REPRESENTING METADATA ABOUT SPECIFIC DATA ELEMENTS. WE ALSO HAD THE WORLD OF SDTM VARIABLES AND CDASH. SDTM IS THE FDA DOMAINS, AND CDASH IS THE DATA COLLECTION FORMS ASSOCIATED WITH THEM, ALONG WITH THE CONTROLLED TERMINOLOGY.
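THE TWO COMPOSITE DATA TYPES DESCRIBED ABOVE, A PHYSICAL QUANTITY THAT CARRIES ITS UNITS AND A CODED ELEMENT THAT CARRIES THE VERBATIM TERM TOGETHER WITH THE CODE AND CODING SYSTEM, AND THE REMARK THAT THEY HAVE TO BE EXPANDED OUT IN THE DATABASE TO BE QUERIED, CAN BE SKETCHED AS FOLLOWS. THIS IS ONLY LOOSELY MODELED ON THE IDEA OF THOSE TYPES. THE CLASS NAMES, FIELD NAMES, THE flatten HELPER, AND THE EXAMPLE CODE AND DICTIONARY NAME ARE HYPOTHETICAL AND ARE NOT TAKEN FROM ISO 21090 OR FROM ANY REAL CODING DICTIONARY.

```python
from dataclasses import asdict, dataclass


# Hypothetical simplifications of the composite types described in the talk.
@dataclass(frozen=True)
class PhysicalQuantity:
    value: float
    unit: str


@dataclass(frozen=True)
class CodedElement:
    original_text: str         # the verbatim term as recorded
    code: str                  # the actual code
    display_name: str          # the coded term
    code_system: str           # which coding system the code belongs to
    code_system_version: str


def flatten(prefix, obj):
    """Expand one composite value into flat column/value pairs for querying."""
    return {f"{prefix}_{name}": value for name, value in asdict(obj).items()}


sbp = PhysicalQuantity(120.0, "mm[Hg]")
# The code, term and dictionary name below are made up for illustration.
event = CodedElement("bad headache", "X0001", "Headache", "EXAMPLE-DICTIONARY", "1.0")

row = {}
row.update(flatten("SYSBP", sbp))
row.update(flatten("AETERM", event))
print(row)
```

THE COMPOSITE OBJECT KEEPS THE VALUE AND ITS CONTEXT TOGETHER WHEN MODELING, WHILE THE FLATTENED ROW IS THE FORM A QUERY WOULD ACTUALLY RUN AGAINST.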
WE HAVE OUR OWN WORLD WHERE WE LOOK AT BLOOD PRESSURE AS BASICALLY BROKEN UP INTO A DEFINED OBSERVATION, WHICH IS THE PROTOCOL SPECIFIED WAY YOU REPRESENT THE INFORMATION. THAT IS METADATA. AND THEN THERE ARE THE PERFORMED RESULTS, WHICH ARE THE COLLECTED DATA ELEMENTS ASSOCIATED WITH THAT CONCEPT. AND THESE THINGS ARE REPEATABLE. YOU COULD HAVE ATTRIBUTES ASSOCIATED WITH THE ENTIRE OBSERVATION AS A WHOLE, WHICH INCLUDE BODY POSITION AND LOCATION, WHAT TYPE OF CUFF WAS USED, AND REPEATED MEASUREMENTS. IN OUR METADATA MODEL WE REPRESENT ALL THESE DISTINCT TYPES OF INFORMATION, HOW THEY CONNECT TO EXTERNAL VOCABULARIES, AND HOW THEY'RE RELATED TO ONE ANOTHER HIERARCHICALLY. THIS IS WHAT A PIECE OF OUR BRIDG MODEL LOOKS LIKE. BURIED WITHIN HERE IS A PROTOCOL DEFINITION. THERE ARE PLANNED AND PERFORMED OBSERVATIONS. IT IS JUST A WAY OF REPRESENTING INFORMATION, BUT WE USE THESE 21090 DATA TYPES AND WE ALIGN WITH THE HL-7 RIM. SEMANTIC WEB: I THINK IT'S ABSOLUTELY THE BEST WAY TO REPRESENT RELATIONSHIPS BETWEEN METADATA, ESPECIALLY WHEN YOU ADD REPRESENTATIVE ONTOLOGIES TO PUT THEM IN PLACE. WE HAVEN'T FIGURED OUT HOW TO DO THAT YET, BUT WE ALREADY COVERED THAT POINT. THE LAST THING I WANT TO COVER, WHICH MIGHT BE RELEVANT, IS AN ACTIVITY I HAVE BEEN INVOLVED IN RECENTLY, THE CLINICAL INFORMATION MODELING INITIATIVE. THIS STARTED OUT ORTHOGONAL TO HL-7. HL-7 HAS HAD PROBLEMS WITH ADOPTION OF V3. THE PROBLEM IS THAT IT'S WAY TOO CUMBERSOME AND COMPLEX AND PEOPLE FOUND IT DIFFICULT TO USE. HL-7, THE HEALTH LEVEL SEVEN STANDARDS ORGANIZATION, IS USED WITHIN THE PRIMARY WORLD OF HEALTHCARE PROVIDERS AND PAYERS. THEY STARTED A BOARD-INITIATED ACTIVITY CALLED (INAUDIBLE): IF YOU STARTED OVER, WHAT WOULD YOU DO? FRESH LOOK SAID KEEP THE RIM, BUT YOU NEED A DIFFERENT WAY TO USE IT. SO YOU DON'T USE THE XML, BECAUSE THAT WILL CHANGE DRAMATICALLY IN THE NEXT YEAR OR TWO. THE CLINICAL INFORMATION MODELING INITIATIVE SPUN OUT FROM THAT. STAN HUFF FROM INTERMOUNTAIN CHAIRED THE FRESH LOOK TASK FORCE.
OUTSIDE OF HL7, THERE ARE MULTIPLE MODELS IN EXISTENCE TO REPRESENT CLINICAL INFORMATION. ONE PROBLEM WE HAVE IS TO BE ABLE TO PROVIDE CONSISTENCY BETWEEN THESE MODELS. SO THAT'S WHERE THE ACTIVITY STARTED. WHAT THEY'RE TRYING TO DO IS BASICALLY IMPROVE INTEROPERABILITY BY HAVING THESE SHARED AND ISOSEMANTICALLY EQUIVALENT CLINICAL INFORMATION MODELS. THEY ARE IN THE REPOSITORY. IT'S ANOTHER POTENTIAL ONE FOR THOSE WHO NEED TO COLLECT METADATA AND SEE HOW IT'S REPRESENTED IN MODELS. THEY BASICALLY DECIDED TO CREATE MODELS THAT ARE GOING TO INTEROPERATE WITH OTHER MODELS. THEY HAVE ADOPTED ADL, THE ARCHETYPE DEFINITION LANGUAGE, WHICH IS THE LANGUAGE USED BY OPENEHR, THE OPEN SOURCE EHR SYSTEM. THAT'S WHAT THEY'RE CHOOSING PRINCIPALLY, AS WELL AS UML. AND THEIR FOUNDATION IS SNOMED. THEY BUILD AROUND IT AND BUILD EQUIVALENCIES TO THAT SO PEOPLE CAN TRANSFORM FROM ONE TO THE OTHER. THAT'S THE GOAL INTERMOUNTAIN HEALTHCARE STATES: SHARE APPLICATIONS, REPORTS, ALERTS AND PROTOCOLS WITH ANYONE. THIS IS A GREAT SLIDE. IF YOU LOOK OVER HERE, LOOK VERY CLOSELY, YOU SEE A LITTLE FOOT. HERE IS A NEONATAL PATIENT IN A NEONATAL CARE UNIT WITH GAZILLIONS OF INSTRUMENTS, AND THEY WANT TO TAKE ALL THIS INFORMATION AND PRESENT IT TO A CLINICIAN WITH ALL THE THINGS THEY HAVE IN ONE LOGICAL PLACE, WHICH IS A VERY DAUNTING TASK BECAUSE ALL THIS STUFF IS GENERATING ELECTRONIC DATA IN DIFFERENT FORMATS AND PUMPING IT OUT. SO THIS IS A VERY COMPELLING IMAGE I BORROWED FROM HIM. THIS IS ANOTHER SCARY ONE, TO GIVE A SENSE THAT WHEN WE'RE TALKING MODELS, THERE IS ONE MODEL THAT SHOWS UP OVER HERE. ONE GOOD THING ABOUT THE RIM, AS MY FRIEND CHARLIE WOULD SAY, IS IT'S ABSTRACT ENOUGH TO REPRESENT ANYTHING TO ANYONE AND MEAN NOTHING. SO ABSTRACT. BUT THAT'S WHAT IT DOES: IT REPRESENTS THINGS IN TERMS OF ENTITIES AND ACTS AND PARTICIPATIONS AND THINGS LIKE THAT BUT DOESN'T HAVE ANY INHERENT CLINICAL SEMANTICS WITHIN IT.
THEY HAVE TO BE APPLIED THROUGH TERMINOLOGIES AND MODELS. SO THE RIM REPRESENTS ONE MODEL BUT THERE ARE OTHER MODELS TOO. THERE ARE CEMs FROM INTERMOUNTAIN. DCMs, DETAILED CLINICAL MODELS, ARE THE SAME THING IN EUROPE, AND THAT CONCEPT IS ALSO USED WITHIN HL7. HL7 HAS THESE THINGS CALLED CLINICAL DOCUMENT ARCHITECTURE TEMPLATES, WHICH REPRESENT BLOOD PRESSURE AND CERTAIN GROUPS OF CONCEPTS. OPENEHR HAS ARCHETYPES, AND THERE IS A EUROPEAN STANDARDS ORGANIZATION STANDARD TO REPRESENT ARCHETYPES. THESE ARE ALL HL7 SPEAK. THE IDEA IS HAVING A SINGLE FORMALISM TO REPRESENT THESE SO THEY CAN BE TRANSLATED INTO ANY OF THESE MULTIPLE FORMATS. ACTUALLY UML IS ONE OF THOSE WAYS OF EXPRESSING IT. SEMANTIC WEB, RDF, IS ANOTHER ONE PEOPLE ARE LOOKING AT. SO TWO THINGS CAME OUT THAT I FOUND INTERESTING, THAT I HAVEN'T HEARD YOU GUYS TALK ABOUT BUT THAT YOU SHOULD KEEP IN THE BACK OF YOUR MIND. ONE IS THE ISSUE OF DECOMPOSITION MAPPING. THE WAY YOU REPRESENT DATA FOR PEOPLE COLLECTING IT IS LIKELY DIFFERENT FROM THE WAY YOU ANALYZE IT. WE HAVE TO SPLIT THINGS UP. IN A PRE-COORDINATED MODEL, YOU MAY ACTUALLY GROUP MANY DATA ATTRIBUTES TOGETHER. IT MIGHT BE SIMPLE ON A DATA ENTRY FORM: THE PROTOCOL MIGHT SPECIFY SYSTOLIC BLOOD PRESSURE COLLECTED ON THE RIGHT ARM IN A SITTING POSITION, SO YOU DON'T NEED TO COLLECT THOSE DETAILS; THE ONLY THING YOU NEED TO COLLECT IS THIS. THAT IS VERY QUICK FOR THE PEOPLE ENTERING THE DATA, AS DISTINGUISHED FROM THE POST-COORDINATED MODEL, WHERE YOU EXPAND IT AND BLOW IT OUT IN YOUR DATABASE TO SEARCH THESE THINGS AND MAKE SURE YOU'RE LOOKING AT THINGS THAT ARE TRULY COMPARABLE, THAT YOU'RE ONLY LOOKING AT SITTING BLOOD PRESSURE WHEN YOU DO, SO YOU'RE GOING TO SPLIT THESE UP. SO YOU SHOULD KEEP THOSE TWO THINGS IN MIND AS YOU'RE DEFINING INSTRUMENTS: THE QUESTION YOU ASK MAY NOT BE THE WAY YOU REPRESENT IT IN THE DATABASE. NOTHING PARTICULARLY NEW ABOUT THAT. SO THERE WE ARE WITH SYSTOLIC BLOOD PRESSURE: ONE ATTRIBUTE, BUT IT BREAKS INTO DIFFERENT DATA FIELDS.
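THE DECOMPOSITION MAPPING JUST DESCRIBED, ONE PRE-COORDINATED FIELD ON THE FORM BREAKING INTO SEVERAL POST-COORDINATED FIELDS IN THE DATABASE, MIGHT LOOK SOMETHING LIKE THE SKETCH BELOW. IT IS A MINIMAL ILLUSTRATION UNDER ASSUMPTIONS: THE FIELD NAMES SUCH AS `SBP_SITTING_RIGHT_ARM`, THE ATTRIBUTE NAMES, THE HELPERS `decompose` AND `select`, AND THE SAMPLE READINGS ARE ALL HYPOTHETICAL AND DO NOT COME FROM BRIDG, CIMI OR ANY REAL INSTRUMENT.

```python
# Hypothetical decomposition map: each pre-coordinated form field is tied to
# the post-coordinated attributes that the protocol fixed for it.
DECOMPOSITION_MAP = {
    "SBP_SITTING_RIGHT_ARM": {
        "test": "SYSTOLIC_BLOOD_PRESSURE",
        "position": "SITTING",
        "location": "ARM",
        "laterality": "RIGHT",
    },
    "SBP_SUPINE_RIGHT_ARM": {
        "test": "SYSTOLIC_BLOOD_PRESSURE",
        "position": "SUPINE",
        "location": "ARM",
        "laterality": "RIGHT",
    },
}


def decompose(field_name, value, unit="mmHg"):
    """Expand one collected form value into separate, queryable data fields."""
    attributes = DECOMPOSITION_MAP[field_name]
    return {**attributes, "value": value, "unit": unit}


def select(rows, **criteria):
    """Keep only the rows whose attributes match every criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


# The person entering data only types the number; the rest comes from the map.
collected = [
    ("SBP_SITTING_RIGHT_ARM", 118),
    ("SBP_SUPINE_RIGHT_ARM", 124),
    ("SBP_SITTING_RIGHT_ARM", 121),
]
rows = [decompose(field, value) for field, value in collected]

# Analysis side: compare only readings that are truly comparable.
sitting = select(rows, test="SYSTOLIC_BLOOD_PRESSURE", position="SITTING")
print([r["value"] for r in sitting])  # prints [118, 121]
```

KEEPING THE MAP AS DATA RATHER THAN BURYING IT IN FORM LOGIC IS ONE WAY TO HOLD BOTH VIEWS, THE COLLECTION VIEW AND THE ANALYSIS VIEW, IN THE SAME REPOSITORY.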
HERE IS ANOTHER ONE, DATA ENTRY STYLES. THIS IS A COMMON THING, A DIFFERENTIATION USED IN STATISTICAL ANALYSIS. IN THE FIRST CASE WE HAVE AN ONLINE FORM WITH A PRE-DEFINED SET OF VALUES: HAIR COLOR, THREE CHOICES. IN THE SECOND CASE YOU HAVE A PULL-DOWN LIST AND YOU PICK THE SAME ONE. THE OTHER OPTION IS TO HAVE THIS FINDING WHERE IT BASICALLY SAYS TELL ME WHAT YOU SAW AND THROW IT IN. THE ONE ON THE TOP IS CALLED EVALUATION STYLE AND THE BOTTOM IS ASSERTION STYLE. EVALUATION IS WHERE YOU'RE BASICALLY GIVING A CONSTRAINED SET, WHEREAS WITH ASSERTION SOMEBODY OBSERVES SOMETHING AND PUTS IT IN THERE. THAT'S ANOTHER PIECE OF METADATA YOU MIGHT KEEP TRACK OF. IT AFFECTS HOW TO INTERPRET AN ANALYSIS, BASED ON WHETHER SOMEONE WAS PROMPTED TO ANSWER WITH ONE OF THE ALLOWED CHOICES OR BASICALLY HAD AN OPEN-ENDED QUESTION THAT THEY COULD USE. OKAY. SO I PULLED THIS FROM THE SET OF SLIDES, THE THREE METADATA CHALLENGES. I'M NOT AN EXPERT ON EVERYTHING IN STANDARDS BUT I CAN TALK ABOUT SOME OF THE THINGS WE HAVE, AND WITH EACH OF THESE I SAW SOME RELEVANCE TO CDISC: ODM AND ADAM, OUR XML MODEL AND HOW WE LOOK AT ANALYSIS DATA; SHARE, WHICH IS OUR GOAL FOR ESTABLISHING A METADATA REPOSITORY, AND MAYBE THERE'S A WAY TO PARTNER AND SHARE, OR AT LEAST LEARN FROM EACH OTHER; AND BRIDG. I HAVEN'T SEEN WHETHER OR NOT THERE IS A DOMAIN ANALYSIS MODEL TO REPRESENT THE WORLD OF THIS TYPE OF STUDY. IT WOULD BE GOOD IF THERE WAS. MAYBE IT BELONGS IN OUR MODEL TOO, BECAUSE WE HAVEN'T DONE A LOT OF THESE STUDIES AND MOST OF THE SAME DATA COLLECTION POINTS AND QUESTIONNAIRES ARE ACTUALLY REPRESENTED ALREADY IN BRIDG. SO AN INTERESTING THING TO LOOK AT. THAT'S BASICALLY ALL I HAD. THANK YOU VERY MUCH. QUESTIONS? DR. HIRSCHFELD. >> WHY IS IT STUDIES ARE SHORT TERM WHEN EXPOSURES ARE FOR PEOPLE WHO WILL PROBABLY BE TAKING CHRONIC MEDICATIONS? YOU WOULD ANTICIPATE THERE WOULD BE LONG-TERM FOLLOW THROUGH. >> THERE OFTEN ARE LONG-TERM FOLLOW-UP STUDIES THAT ARE DONE.
THAT WASN'T THE PERSPECTIVE WE TOOK WHEN WE DEVELOPED THE STANDARD. THE PEOPLE ARE VOLUNTEERS COMING FROM PHARMACEUTICAL ORGANIZATIONS, FOCUSED PRINCIPALLY ON THE CRITICAL PATH TO APPROVAL, BECAUSE THAT'S THE PROVERBIAL MILLION DOLLARS A DAY THEY'RE NOT IN THE MARKET. SO ABSOLUTELY. I THINK PEOPLE HAVE MAPPED LONG-TERM STUDIES INTO SDTM, AND THEY HAVE USED ODM TO REPRESENT THAT INFORMATION. THEY WEREN'T REALLY FOREMOST IN THE REQUIREMENTS. >> UNDERSTOOD. I JUST WANTED TO ASK A QUESTION AND MAKE THE POINT THAT THERE IS A CONTINUUM. IT'S NOT REALLY TWO DIFFERENT WORLDS, CONCEPTUALLY. >> WE ARE LEARNING THAT MORE EACH DAY. PEOPLE THINK THEY'RE LIVING IN THEIR OWN DOMAIN. I HAVE BEEN WORKING WITH HL7 FOR OVER A DECADE. AND THERE ARE A NUMBER OF THINGS; AT ROOT, THE THINGS WE OBSERVE IN RESEARCH ARE PRETTY MUCH THE SAME. WE JUST REPRESENT THEM IN DIFFERENT WAYS. IT WILL TAKE US A WHILE, BUT AS A STANDARDS ADVOCATE I FIRMLY BELIEVE WE'RE GOING TO GET THESE THINGS TOGETHER, WHICH IS WHY I LOVE RECOGNIZING THAT WE'RE MATURING. WE'RE GETTING THERE BUT WE'VE GOT A WAYS TO GO. >> I HAD A QUESTION. YOU HAVE LIVED IN DIFFERENT MODELS. OBVIOUSLY THEY'RE QUITE DISPARATE, BUT IS THERE SOMETHING YOU FIND IN TERMS OF THEMES THAT IMPROVE THE USABILITY OR IMPAIR THE FEASIBILITY OF THE MODEL? >> DEPENDS ON WHAT YOU'RE DOING WITH THE MODEL. THE ARCHETYPES AND DCMs MAKE SENSE TO A CLINICIAN BECAUSE OF THE WAY THEY ARE. THE RIM MAKES NO SENSE TO A CLINICIAN. THE RIM WAS DESIGNED TO BE ABLE TO TAKE ANY TYPE OF INFORMATION AT SOME LEVEL AND REPRESENT IT IN A CERTAIN WAY. THAT'S WHY I SHOWED THE LAST TWO SLIDES: HOW YOU REPRESENT INFORMATION FOR THE END USER COMMUNITY COLLECTING DATA IS LIKELY DIFFERENT FROM HOW YOU REPRESENT IT IN YOUR DATABASE, AND IT'S AN IMPORTANT CONCEPT TO KEEP THOSE ASPECTS OF METADATA BOTH WAYS IN YOUR REPOSITORY.
>> DO YOU HAVE CONDITIONS TO FORMALLY MAP (INAUDIBLE) LIKE THE (INDISCERNIBLE) CLASSIFICATION, WHICH HAS HAD 28 VERSIONS OVER TIME, BUT IF YOU WANT TO KNOW WHAT CHANGED YOU HAVE TO CONSULT THE PEDIATRIC (INAUDIBLE). (INDISCERNIBLE) >> WITHIN THE REQUIREMENTS FOR OUR SHARE REPOSITORY WE RECOGNIZE THERE'S A CRITICAL NEED TO BE ABLE TO DO TRACEABILITY AND KEEP THE CHANGES OVER TIME. WITHIN OUR ODM MODEL WE MAKE SURE TO REPRESENT EACH CHANGE SO YOU CAN BACKTRACK, WHICH IS A VERY USEFUL THING. ONE OF THE OTHER USE CASES FOR ODM IS THE ARCHIVAL USE CASE. WHAT YOU CAN DO IS SAVE THIS XML FILE, WHICH HAS ALL THE THEN-CURRENT METADATA AND THE DATA ASSOCIATED WITH IT, AS A SNAPSHOT AND ACTUALLY STORE THAT SOMEWHERE, MAYBE NOT DIRECTLY BUT PERHAPS AS A LINK FROM THE REPOSITORY, TO TRACK SOME OF THESE THINGS OVER TIME. WHAT'S INTERESTING IS FINDING A WAY TO REPRESENT CHANGES OVER TIME AND HOW TO LOOK AT IT FROM A USER INTERFACE. I HAVEN'T THOUGHT ABOUT THAT SO MUCH, SO IT WILL BE AN INTERESTING PROBLEM TO ATTACK. YES. WITHIN OUR INDUSTRY IT'S EXTREMELY IMPORTANT TO BE ABLE TO REPRODUCE AND TRACE BACK BECAUSE OF REGULATORY OVERSIGHT. SO WE ARE VERY CAREFUL ABOUT THAT AND TRACE BACK AND LOOK AT THAT OVERALL TIME LINE. >> JUST WANTED TO NOTE THAT THERE ARE A COUPLE OF STUDY CENTERS, AND THEY'RE REPRESENTED HERE, WHO USE THE CALENDAR (INAUDIBLE). SO HOWEVER WE THINK ABOUT THE PROTOCOL AND HOWEVER WE DESCRIBE AND DEFINE IT, THESE FOLKS ARE ABLE TO HARMONIZE THOSE DEFINITIONS WITH THE DEFINITION THAT COMES OUT OF BRIDG, AND THEY'RE DOING THAT. >> IT'S ACTUALLY IN BRIDG BUT THE RISE IN THE BRIDG, THE NCI BEING A PRINCIPAL STAKEHOLDER OF THE BRIDG BY FIRST BRIDG MODELING AND USING THE BRIDG SO IT'S USEFUL. (INAUDIBLE). (OFF MIC) >> I'LL JUST NOTE THAT DURING THE TECH TRANSFER THAT'S GOING ON AT THE MOMENT, WE ARE LOOKING TO INTEGRATE NOT JUST CDISC AND NICHD, NCS AND THE OTHER INSTITUTES BUT ALSO, AMONG OTHER HIGH PROFILE PROGRAMS, THE CLINICAL AND TRANSLATIONAL SCIENCE AWARDS.
WE HAVE HAD, I BELIEVE, SOME VERY PRODUCTIVE INTERACTIONS WITH THE CTSA CONSORTIUM, WITH INDIVIDUALS AND AS A WHOLE. WE WANT TO CONTINUE TO BUILD ON THAT. IN THE LARGER PICTURE WE DO BEST TO LEVERAGE TOGETHER WHAT WE HAVE RATHER THAN TRY TO DO IT SEPARATELY. AND I THINK OUR NEXT SPEAKER IS EXCELLENT AT TRYING TO LEVERAGE AND BRING TOOLS FROM ONE TO THE OTHER. >> THANK YOU, DR. HIRSCHFELD. PLEASURE TO BE HERE, APPRECIATE THE INVITATION. SO I HAVE BEEN A LONG-TERM MEMBER OF THE caBIG COMMUNITY AS WELL AS PART OF THE CTSA COMMUNITY AND PART OF THE NCS. SO IT'S GREAT TO BE HERE TO TALK ABOUT WHAT WE'RE DOING. I AM GOING TO DIVE DOWN INTO SOME OF THE WEEDS IN TALKING ABOUT THE MDES AND WHERE THERE ARE IMPROVEMENTS TO BE MADE. I HAVE WAY TOO MANY SLIDES SO I'M GOING TO SKIP THROUGH THINGS OTHERS HAVE SPOKEN ABOUT. I WAS HOPING THERE WOULD BE -- SOMETHING DR. CHRIS FOREST BROUGHT UP WITH ME THAT RESONATED IS THE IDEA THERE'S A TRAJECTORY AND HEALTH POTENTIAL IN EVERY ONE OF US. AND IT'S SOMETHING I HAVE SEEN IN MY OWN LIFE. I HAVE FOUR KIDS, THEY'RE DIFFERENT FROM EACH OTHER AND THEY ALL REACHED DEVELOPMENTAL STAGES AT DIFFERENT CALENDAR TIMES. SO MY LITTLE GIRL, LIKE MOST LITTLE GIRLS, IS EMOTIONALLY FAR AHEAD OF MY 15-YEAR-OLD SON, AND SHE'S FIVE. THERE'S NO QUESTION IN MY MIND SHE'S WAY OUT THERE. SO HOW DO WE START TO CAPTURE THAT DATA AND REPRESENT IT APPROPRIATELY SO WE CAN SEE HOW EACH PERSON WHO IS PART OF THE NCS GOES THROUGH THEIR DEVELOPMENTAL STAGES? ALSO INTRIGUING TO ME IS MY GRANDFATHER. WHEN HE WAS 94 AND DYING, HE MADE A COMMENT THAT HE WAS STILL 16 EMOTIONALLY INSIDE. NOT THAT HE BEHAVED LIKE A 16-YEAR-OLD, BUT THERE WAS A LEVEL OF ENTHUSIASM AND DRIVE THAT REALLY DID PERMEATE HIS LIFE. FROM A HEALTH POTENTIAL STANDPOINT HE WAS A LIFELONG DIABETIC. BUT HE WAS A TYPE 2 DIABETIC AND CONTROLLED HIS DIET AND DIDN'T NEED INTERVENTION. THAT'S A POWERFUL STATEMENT THAT INDIVIDUALS CAN CHANGE THEIR OWN HEALTH. SO HOW DO WE TAKE THAT INTO ACCOUNT? HOW DO WE CHANGE INDIVIDUALS IN THE NCS?
SO SORRY, NOW BACK TO THE GEEKY SIDE OF THINGS. I DID UNDERSTAND ALL 12 TERMS SO I'M ONE OF THE PEOPLE THAT NEEDS THE 12-STEP PROCESS FROM MINNESOTA. GREAT. I APPRECIATE THAT. SO WHAT I WANT TO DO TODAY IS WALK THROUGH THE MDES FROM A LITTLE DIFFERENT PERSPECTIVE, SKIP THROUGH A NUMBER OF SLIDES THAT YOU HAVE SEEN, THEN TALK ABOUT HOW WE HAVE BEEN DOING THAT FROM AN IMPLEMENTATION STANDPOINT, HOW WE'RE TAKING MDES DATA AND BUILDING OUR IMS AROUND THE MDES SO WE CAN SUBMIT DATA TO THE VANGUARD DATA REPOSITORY. THEN A LITTLE ON THE OPERATIONAL ELEMENTS, AND HOW I WOULD LIKE TO SEE THE VERSIONING HAPPENING, HOW IT'S ALREADY HAPPENING. THESE ARE TOPICS WE HAVE DISCUSSED TODAY. AND THEN I WANT TO TALK A LITTLE BIT ABOUT THE OPEN BIOMEDICAL ONTOLOGIES. THAT'S A GROUP THAT HASN'T GOTTEN MUCH DISCUSSION YET TODAY. THEY HAVE BEEN AROUND NOW 13, 14 YEARS. IT'S A VERY DIFFERENT APPROACH TO METADATA AND ONTOLOGY. WE CAN BENEFIT FROM SOME OF THE THOUGHTS THAT HAVE GONE ON IN THAT COMMUNITY. SO THIS IS SOMETHING THAT HAS ALREADY BEEN GONE THROUGH A FAIR AMOUNT. THE MDES IS ORGANIZED INTO TABLES. THE CONSTRAINTS ARE CONTEXTUAL. THIS IS A BIG PROBLEM. THE LISTINGS OF INSTRUMENTS ARE NOT COMPLETE, SO THERE ARE INDIVIDUAL ELEMENTS WHERE THE INSTRUMENTS THEY SHOW UP IN ARE NOT LISTED, SO THIS IS A CONSISTENCY ISSUE. I'LL GO THROUGH A COUPLE OF VERY SPECIFIC EXAMPLES HERE. THESE ARE THE CURRENT LISTED ATTRIBUTES. I'M NOT GOING TO GO THROUGH WHAT ALL THESE ARE; YOU HAVE SEEN THEM IN SLIDES. THEY'RE A GREAT START. I WISH EVERY MULTI-SITE STUDY I PARTICIPATE IN HAD SOMETHING LIKE THE MDES. SO MY FIRST POINT IS DON'T THROW OUT THE BABY WITH THE BATH WATER. WE CAN IMPROVE WHAT WE ALREADY HAVE. SOME THINGS LIKE THE INSTRUMENT NAME, CATEGORY AND THE FORMAT CONSTRAINT COULD USE ADDITIONAL SEMANTICS SO THEY'RE A BIT MORE COMPUTABLE. SO I TOOK THE FIRST THREE ITEMS OUT OF PREGNANCY VISIT ONE. YOU CAN SEE, OR ACTUALLY YOU CAN'T SEE, THAT THERE ARE A NUMBER OF LABEL AND DEFINITION COMMENTS THAT AREN'T COMPUTABLE.
BUT THEY'RE VALUABLE, SO SOMEBODY HAS TO LOOK AT THESE. AND THEY HAVE TO LOOK AT THEM BEFORE THEY CAN REPRESENT THEM IN THEIR LOCAL IMS. IN THE NCS, IMS IS THE INFORMATION MANAGEMENT SYSTEM; NOT EVERY GROUP KNOWS THAT'S WHAT THAT MEANS. THEN THERE'S THE FORMAT CONSTRAINT, WHICH AGAIN IS VERY TEXTUAL. IT HAS VERY IMPORTANT INFORMATION IN IT BUT WE CAN'T QUITE AUTOMATE IT. THEN THE INSTRUMENT NAME: YOU'LL NOTICE EMBEDDED IN THE INSTRUMENT NAME IS WHICH RECRUITMENT STRATEGIES YOU CAN USE THIS SPECIFIC ELEMENT FOR. THAT IS GREAT DATA BUT IT'S NOT SOMETHING YOU CAN PULL OUT AND USE. SO WE NEED MORE STRUCTURE. MORE OF THE SAME. I WON'T MAKE ANY MORE COMMENTS ABOUT THIS EXCEPT TO SAY THE DATA TYPE IS VERY IMPORTANT BUT WE NEED MORE DATA AROUND IT. ANOTHER COMMENT IS ABOUT CODE LISTS. WE HEARD THIS MENTIONED A NUMBER OF DIFFERENT WAYS. BEHIND THE CODE LISTS, THESE ARE ALL TALKING ABOUT DIFFERENT QUESTION ITEMS THAT ASK ABOUT ALCOHOL FREQUENCY. AT A CONCEPTUAL LEVEL THERE NEEDS TO BE A SINGLE ID FOR ASKING A QUESTION ABOUT ALCOHOL FREQUENCY, AND THEN ALL OF THESE ITEMS SHOULD BE REPRESENTED IN ONE PLACE, WITH A WAY TO TIE A SPECIFIC INSTRUMENT TO WHICH VALUES ARE ALLOWED. HAVING A SERIES OF CODE LISTS INSTEAD MAKES IT HARD TO DO THE KINDS OF LONGITUDINAL COMPARISONS WE NEED TO BE ABLE TO DO. THESE ARE MY SUMMARY COMMENTS ABOUT THE FIRST FEW SLIDES. THE ONE I DIDN'T MENTION IS THAT SKIP PATTERNS ARE INCREDIBLY IMPORTANT AND THEY'RE EMBEDDED NOW IN WORD DOCUMENTS, SO WE CAN'T EASILY GET THAT OUT COMPUTATIONALLY. SO I WAS ANTICIPATING COMMENTS FROM PEOPLE. EVERYBODY AGREED? >> I'M JESSICA GRAYBURN WITH THE PROGRAM OFFICE. I DON'T DISAGREE WITH ANYTHING YOU HAVE SAID TO DATE. SOME OF THE CHALLENGES WE FACE HAVE TO DO WITH LOOKING AT THE CODE FRAMES. IF WE'RE USING EXISTING VALIDATED INSTRUMENTS WE CAN'T NECESSARILY, NOR DO WE NECESSARILY WANT TO, CHANGE THE CODE FRAME.
SO THERE SOMETIMES IS THE NEED FOR FRAMES THAT SEEM TO REPRESENT THE SAME THING. ANOTHER ISSUE IS MODE OF ADMINISTRATION. THERE MIGHT BE AN APPROPRIATE CODE FRAME FOR AN IN-PERSON INTERVIEW. WE WANT TO MAKE SURE WE MAKE THESE PRODUCTS AS EFFICIENT AS POSSIBLE BUT ACCOUNT FOR THE VARIATION WE HAVE TO HAVE. >> SO THAT IS ANOTHER GOOD POINT. IT POINTS OUT WE NEED CONCEPT IDs BEHIND EACH CODE SET TO SAY THAT IN ALL THOSE CASES WE'RE STILL DEALING WITH ALCOHOL FREQUENCY QUESTIONS. THEY ARE COMING FROM POTENTIALLY DIFFERENT INSTRUMENTS VALIDATED IN DIFFERENT WAYS. WE CAN'T CHANGE THE INSTRUMENT ITSELF, BUT YOU CAN CHANGE HOW YOU HARMONIZE IT AND HOW YOU POINT OUT THAT IN FACT MULTIPLE INSTRUMENTS ARE ASKING THE SAME THING. THERE SHOULD BE SOME WAY TO WALK THROUGH AND SAY THIS PARTICIPANT OVER TIME IS SAYING THE SAME THING EVERY TIME. >> SO THE STORY PROBABLY ISN'T AS BAD AS IT WAS JUST STATED. THERE IS THIS CONCEPT OF A MASTER DATA ELEMENT WHICH SPECIFIC CODE LISTS RELATE BACK TO, BUT THAT HASN'T BEEN EXPOSED. WE HAVEN'T REALLY PROMOTED THAT IN THE METADATA TO THIS POINT BUT IT'S PART OF THE METADATA. >> I WILL ADD, JUST AS IN SOME OF THE OTHER NIH INITIATIVES, LIKE THE ONE DR. FOREST REFERRED TO, THE PATIENT-REPORTED OUTCOMES MEASUREMENT INFORMATION SYSTEM, YOU USE DIFFERENT WORDS AT DIFFERENT LIFE STAGES TO GET AT THE SAME CONCEPT. I THINK PART OF THE FRUSTRATION IS THAT WE HAVE BEEN PRIORITIZING THE OPERATIONAL DATA ELEMENTS. THAT'S WHAT WE'RE INTERESTED IN. ALL THE OTHER QUESTIONS, ABOUT DID SOMEONE EAT CHOCOLATE COVERED RAISINS AND, IF SO, HOW OFTEN DURING PREGNANCY, AND WHATNOT, THOSE ARE PLACE HOLDERS. WE WERE ESSENTIALLY TAKING WHAT WAS OUT THERE, WHAT WAS CONVENIENT, WHAT HAD BEEN VETTED BY OMB, WHAT SOMEONE SAID IS A VALID INSTRUMENT THAT YOU CAN USE PART OF AND IT WON'T COST YOU ANYTHING, AS OPPOSED TO HAVING TO SHELL OUT FOR SOMETHING ELSE AT A GREAT COST. SO ALL THOSE THINGS WERE FILLER SO WE CAN DO A VISIT. THOUGH WE WILL ANALYZE THESE DATA, THEY ARE NOT THE MAJOR ANALYTIC DATA SETS.
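THE CONCEPT ID IDEA RAISED IN THIS EXCHANGE, SEVERAL CODE LISTS FROM DIFFERENT INSTRUMENTS ALL POINTING BACK TO ONE CONCEPT SO A PARTICIPANT CAN BE FOLLOWED OVER TIME, COULD BE SKETCHED AS BELOW. NOTHING HERE IS TAKEN FROM THE MDES: THE CODE LIST NAMES, THE CONCEPT ID `CONCEPT:0001`, THE CODES AND LABELS, THE VISIT NAME `LATER_VISIT`, AND THE HELPERS `harmonize` AND `history` ARE ALL HYPOTHETICAL.

```python
from collections import defaultdict

# Two hypothetical code frames from different instruments. The instruments are
# left unchanged; each code list simply records which concept it is about.
CODE_LISTS = {
    "ALC_FREQ_CL1": {
        "concept_id": "CONCEPT:0001",  # placeholder ID for "alcohol frequency"
        "values": {1: "NEVER", 2: "MONTHLY", 3: "WEEKLY", 4: "DAILY"},
    },
    "ALC_FREQ_CL2": {
        "concept_id": "CONCEPT:0001",
        "values": {0: "NEVER", 1: "WEEKLY", 2: "DAILY"},
    },
}


def harmonize(code_list_id, code):
    """Translate an instrument-specific code into (concept ID, shared label)."""
    code_list = CODE_LISTS[code_list_id]
    return code_list["concept_id"], code_list["values"][code]


def history(responses):
    """Group one participant's answers by concept, in the order collected."""
    by_concept = defaultdict(list)
    for visit, code_list_id, code in responses:
        concept_id, label = harmonize(code_list_id, code)
        by_concept[concept_id].append((visit, label))
    return dict(by_concept)


# The same participant answers two different instruments at two visits.
responses = [
    ("PREGNANCY_VISIT_1", "ALC_FREQ_CL1", 3),
    ("LATER_VISIT", "ALC_FREQ_CL2", 1),
]
timeline = history(responses)
print(timeline)
```

IN THIS SKETCH THE RAW CODES DIFFER (3 IN ONE FRAME, 1 IN THE OTHER) BUT BOTH RESOLVE TO THE SAME CONCEPT AND THE SAME LABEL, WHICH IS THE KIND OF LONGITUDINAL COMPARISON THE SPEAKERS ARE ASKING FOR.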
WE'RE LOOKING FOR ANALYTIC DATA SETS THAT HOVER AROUND 500 TO 800 DATA ELEMENTS. BUT WHAT I'M LEARNING FROM THE CONVERSATIONS WE HAD, AND I THINK WHAT DR. KUBICK IS ANALYZING TODAY, IS THAT WE'RE GETTING TANTALIZINGLY CLOSE ALMOST BY ACCIDENT. NOW IT'S TIME TO GET SERIOUS AND THINK ABOUT IT AS WE GO FORWARD, WHEN WE GET TO FORMING CONTENT. IS THAT AN ACCURATE -- >> I THINK THAT'S A GREAT COMMENT. IN FACT SOME OF THOSE FILLER QUESTIONS HAVE TAKEN ON A LIFE OF THEIR OWN, WHICH PROBABLY HAS BEEN TO THE DETRIMENT OF MANY OF THE STUDY CENTERS. SO I WANT TO KEEP GOING HERE. I HAVE UNFORTUNATELY MORE COMMENTS. A QUICK COMMENT THAT WAS ALREADY MENTIONED: THE MDES IS AVAILABLE IN EXCEL AND THE INSTRUMENTS ARE AVAILABLE AS WORD DOCUMENTS. I WOULD LOVE TO SEE THEM AVAILABLE AS COMPUTATIONALLY TRACTABLE ELEMENTS. I'LL MAKE A COMMENT AT THE VERY END ABOUT THAT AS WELL. SO TO SKIP A BIT: HOW DO WE MAKE SENSE OF THIS AND HOW DO WE MAKE IT WORK FOR US? WE HAVE THE NCS NAVIGATOR, WHICH IS WHAT WE HAVE CALLED THE SYSTEM THAT HOLDS DATA FOR THE NCS. AND WE HAVE A WAREHOUSE THAT IS THE REPRESENTATION OF THE MDES AND HAS ALL THE DIFFERENT VERSIONS OF THE MDES IN IT. THAT INFORMS HOW WE BUILD THE DATA COLLECTION INSTRUMENTS, HOW WE CAPTURE OPERATIONAL DATA, THAT'S DOWN IN SOUTH PORTAL, AND HOW WE REPORT THAT THROUGH TO THE VDR. RIGHT NOW A BIG PAIN POINT FOR US IS THAT, AS WE PULL THE SCHEMA IN, THERE IS A FAIRLY COMPLEX SET OF HEURISTICS WE HAVE TO RUN TO TRY TO INTERPRET THE EXISTING DOCUMENTATION AND GET IT INTO A COMPUTABLE FORMAT. THIS IS EXACTLY WHAT JAY WAS TALKING ABOUT WITH THE MDR: THEY'RE TAKING AN APPROACH TO BUILDING THE EXACT SAME REPRESENTATION. WE ALSO NEED TO SEE THAT COMPUTABLE REPRESENTATION EXPOSED AS RDF SO WE CAN CONSUME IT. THEN WE DON'T HAVE TO SPEND TIME OVER HERE. IN FACT YOU GET THAT COMPUTABLE REPRESENTATION AND HAVE IT AVAILABLE TO US IN REAL TIME. I WON'T GO THROUGH ALL THE OTHER STEPS THAT WE HAVE TO GO THROUGH, BUT THEY'RE AUTOMATED. SO THE THINGS THAT AREN'T RED WE CAN AUTOMATE.
THINKING ABOUT INSTRUMENTS, WE HAVE RIGHT NOW A VERY DIFFICULT TIME WITH EACH OF THE INSTRUMENTS, SO WE DO A MANUAL REVIEW AS WE GET A NEW VERSION. I EXPECT EVERY SITE HAS TO DO THAT. RIGHT NOW THERE ARE 3200 QUESTIONS THAT WE HAVE TO INSPECT, AND IF THERE ARE CHANGES WE HAVE TO MANUALLY INSPECT THEM. THAT'S A LOT OF WORK. AND I THINK EVERY SITE IS GOING THROUGH THAT. THIS IS JUST TO SEE WHAT IT LOOKS LIKE AFTER WE HAVE DONE ALL THE REVIEW. YOU CAN SEE THERE'S A CODING, AND THAT'S HOW WE DISPLAY THE INSTRUMENT. FINALLY, LOADING DATA INTO OUR MDES WAREHOUSE, WE TAKE THE SPREADSHEETS AND DOCUMENTATION. THIS IS JUST A DIFFERENT VERSION OF TWO SLIDES AGO, HOW WE CAN GET THE DATA REPRESENTED. SOMETHING HERE IN THE MIDDLE IS THE ACTUAL PROTOCOL WORKFLOW. THAT'S A PIECE WE USE THE PATIENT STUDY CALENDAR FOR. I'M GOING TO GO THROUGH THE PATIENT STUDY CALENDAR AND TALK ABOUT LESSONS WE LEARNED WORKING WITH caBIG AROUND SEMANTICS, HARMONIZATION, DOMAIN MODELS, SO THAT'S THE CASE OF BOTH CDISC AND NCI, AND VERSIONING. I'LL SKIP OVER SLIDES BECAUSE THEY'RE NOT RELEVANT HERE. BUT THE PATIENT STUDY CALENDAR REPRESENTS THESE THREE FUNCTIONS: HOW DO YOU CREATE A TEMPLATE TO REPRESENT THE STUDY, HOW DO YOU CREATE A PLANNED DOCUMENT OF WHAT'S GOING TO HAPPEN FOR A GIVEN PARTICIPANT AND HOW THAT HAPPENS, AND HOW DO YOU GO ABOUT TRACKING THE CHANGES. SO THIS IS ANOTHER VIEW OF BRIDG. YOU CAN SEE LOTS OF LITTLE PIECES TO IT. I WON'T GO THROUGH IT ALL OTHER THAN TO SAY WE HAVE THE PATIENT STUDY CALENDAR AND WE HAVE THESE COMPONENTS TO IT. SOMETHING WE HAVE GOTTEN FROM SDTM AND CDISC WAS THE IDEA TO HAVE PLANS, SCHEDULES, TO THOSE PLANS AND SCHEDULES. SO IT PROVIDES THE BACKBONE FOR ALL THIS. YOU CAN SEE AS YOU WALK THROUGH A VERY SMALL PART OF THE DATA MODEL, IT HAS IMPLICATIONS FOR HOW THE ACTUAL APPLICATION IS BUILT. MY POINT WITH THAT PREVIOUS SLIDE IS TO SAY THERE IS THIS END REPRESENTATION, AND WE HAVE HEARD THAT FROM A NUMBER OF PEOPLE.
YOU CAN HAVE PAPER, TELEPHONE AND ELECTRONIC CASE REPORT FORMS, AND YOU NEED TO TRACK AND UNDERSTAND THE DIFFERENCES BETWEEN EACH OF THOSE. YOU HAVE TO GO THROUGH THAT IN THE HARMONIZATION PROCESS TOO. HARMONIZATION ITSELF DOESN'T GUARANTEE SEMANTIC CONSISTENCY. PART OF THAT IS, PARTICULARLY WHEN YOU'RE DEALING WITH A DOMAIN MODEL VERSUS AN APPLICATION MODEL, THERE'S INTERPRETATION FROM THE DOMAIN TO HOW PEOPLE ACTUALLY USE IT. I WOULD LOVE TO SEE HOW THE DDI HANDLED THAT PARTICULAR ISSUE, WHETHER IT'S VIEWED FROM AN APPLICATION STANDPOINT OR DEALT WITH MORE AT THE DOMAIN LEVEL, OR HOW YOU WALK BETWEEN THOSE TWO. THIS IS ONE THAT'S TRICKY. IF YOU LOOK IN THE caDSR FOR ADVERSE EVENT GRADE, THERE ARE OVER 400 TERMS FOR ADVERSE EVENT GRADE. THAT'S PARTLY BECAUSE THE caDSR HAS TO REPRESENT EVERYBODY'S INTERPRETATIONS. FOR THE NCS, WE WANT TO HAVE A SINGLE REPRESENTATION OF A GIVEN ITEM. SO I'M GOING TO SKIP OUT OF OUR caBIG EXPERIENCE AND MOVE INTO THE OPEN BIOMEDICAL ONTOLOGY WORLD. TRISH WETSELL IS HERE, SHE'S PART OF THE NCBO. THAT'S THE NATIONAL CENTER FOR BIOMEDICAL ONTOLOGY. AND OBO ACTUALLY PRE-DATES THE NCBO BY PROBABLY FIVE YEARS. GENE ONTOLOGY IS THE QUINTESSENTIAL OPEN BIOMEDICAL ONTOLOGY. FOR THOSE NOT FAMILIAR WITH GENE ONTOLOGY I'LL GO THROUGH A FEW THINGS QUICKLY. WHAT IS INTERESTING TO ME, AND I HADN'T THOUGHT ABOUT THIS WHEN I DREW UP THESE SLIDES, IS THAT GENE ONTOLOGY IS ABOUT NORMAL FUNCTION: MOLECULAR FUNCTION, BIOLOGICAL PROCESS AND LOCALIZATION, WHAT NORMALLY HAPPENS FOR A GIVEN GENE PRODUCT, NOT THE ABNORMAL FUNCTION. A LOT OF WHAT WE THINK OF, PARTICULARLY WHEN LOOKING AT CLINICAL RESEARCH, IS WHAT'S ABNORMAL. AGAIN, CHRIS REALLY BROUGHT THIS UP: HOW DO WE THINK ABOUT HEALTH? GENE ONTOLOGY THINKS ABOUT, OR REPRESENTS, NORMAL FUNCTION. SO IT IS A DIRECTED GRAPH WITH STANDARD SEMANTICS. THE GO CONSORTIUM RUNS A SET OF WORKSHOPS EVERY YEAR TRAINING ANNOTATORS HOW TO USE IT APPROPRIATELY.
SO THERE'S CONSISTENCY OUT IN THE COMMUNITY. IT HAS A UNIQUE NAMESPACE AND SUPPORTS LOGICAL DEFINITIONS AND CONCEPT IDs. EVERY SINGLE TERM HAS A CONCEPT ID AND, UNLESS THE CONCEPT CHANGES, IT'S IMMUTABLE. THAT IS AN IMPORTANT IDEA. CONCEPT IDs DON'T CHANGE BECAUSE YOU CHANGE CODING. IT SUPPORTS REFERENCES TO OTHER ONTOLOGIES AND VOCABULARIES. YOU'RE FREE TO SHOW WHAT'S EITHER AN EXACT SYNONYM OR A PARTIAL OVERLAP WITH A GIVEN CONCEPT. IT SUPPORTS MULTIPLE RELATIONSHIPS AND IT IS ATOMICALLY VERSIONED. THIS IS A BIG DEAL. THAT MEANS YOU CAN GO BACK OVER THE 12 YEARS OF GENE ONTOLOGY AND SEE EVERY SINGLE CHANGE THAT EVER HAPPENED. YOU CAN SEE THE COMPLETE HISTORY OF EVERY SINGLE TERM IN GENE ONTOLOGY. SO TALKING BRIEFLY ABOUT DISEASE ONTOLOGY, THIS IS A PROJECT I'M THE PI ON. JUST UP THE ROAD HERE. THE TWO OF US HAD THE PLEASURE OF TAKING DISEASE ONTOLOGY AND MAKING IT A MORE USEFUL ONTOLOGY. I DON'T WANT TO TAKE A LOT OF TIME TALKING ABOUT DISEASE ONTOLOGY, BUT HERE ARE A FEW THINGS WE HAVE DONE AS FAR AS MAKING IT A RICHER REPRESENTATION. HERE IS THE LOGICAL DEFINITION FOR ONE PARTICULAR DISEASE. YOU CAN SEE AN ID WITH THE DOID NAMESPACE AND A UNIQUE NUMBER. IT HAS A HUMAN READABLE DEFINITION. WE THINK IT'S HUMAN READABLE. THEN IT HAS AN ID TO THE PARENT AND LOGICAL DEFINITIONS THAT LET YOU REACH OUT TO OTHER ONTOLOGIES OR OTHER RELATIONSHIPS. SO FOR INSTANCE, THIS INTERSECTS WITH THE FOUNDATIONAL MODEL OF ANATOMY, AND THAT HAPPENS TO BE SKIN. THE AGENT THAT CAUSES THIS DISEASE HAS NCBI TAXON ID 6248, SO YOU CAN MOVE UNAMBIGUOUSLY FROM DISEASE ONTOLOGY TO THESE OTHER REFERENCES. YOU CAN SEE A NUMBER OF OTHER ONTOLOGIES. WHAT THAT ENABLES YOU TO DO IS VISUALIZE A LOT OF RELATIONSHIPS. SO HERE IS INHALATION ANTHRAX. YOU CAN SEE A NUMBER OF SYMPTOMS IN THE HUMAN PHENOTYPE ONTOLOGY, AND IT HAS A TRANSMISSION ID, WHICH IS THE AIRBORNE TRANSMISSION METHOD. INHALATION ANTHRAX IS AN ANTHRAX, IT IS A LUNG DISEASE, AND IT IS FOUND IN THE LUNG, THE FMA TERM FOR LUNG.
WHAT THIS ALLOWS YOU TO DO IS TO GO INTO OTHER ONTOLOGIES AND UNDERSTAND RELATIONSHIPS THERE. SO YOU CAN DO COMPLICATED COMPUTATION AND INFERENCE BASED ON THAT. SO JUST ONE: IT IS A MALIGNANT NEOPLASM AND SO FORTH. IT'S MADE AVAILABLE AS RDF. AND THE REAL POINT TO ALL THIS IS THEN YOU CAN START TO COUPLE IT WITH ACTUAL DATA SETS. THIS HAPPENS TO BE A GENE EXPRESSION DATA SET, AND THERE ARE LOTS AND LOTS OF GENES THAT ARE COUPLED THEN TO VARIOUS DISEASES, AND YOU CAN SEE WHICH GENES ARE INVOLVED IN MULTIPLE DISEASES. VERY QUICKLY YOU CAN INSPECT THAT. IT'S VERY, VERY POWERFUL, AND IT ALL COMES OUT OF THE FACT THAT ALL THESE TERMS ARE INTEROPERABLE BETWEEN VARIOUS ONTOLOGIES. SO THAT WAS MY PITCH FOR A VERY DIFFERENT VIEW OF HOW OTHER GROUPS ARE HANDLING EXACTLY SOME OF THESE METADATA ISSUES, OR REALLY SEMANTIC ISSUES. THIS ISN'T REALLY METADATA, IT'S SEMANTICS, SEMANTIC RELATIONSHIPS. AND THIS IS REALLY WHAT I SAID IN THE EARLIER SLIDE, SO I'M NOT GOING TO RECAP IT, EXCEPT I WOULD SAY THE ONE THING I DIDN'T MENTION IS THAT OUR ABILITY TO MAINTAIN SPECIFIC DATA ELEMENTS AND MAP THEM OVER TIME IS REALLY CRITICAL FOR US, BECAUSE WE DO KNOW ALL THESE INDIVIDUAL INSTRUMENTS ARE GOING TO CHANGE. I THINK -- I HAVE THE RECOMMENDATION SLIDE. I CAN'T SKIP THAT. SO JUST AGAIN, I DON'T WANT TO THROW OUT ALL THE GOOD WORK DONE. TEXT FILES ARE INCREDIBLY IMPORTANT, BUT THEY HAVE TO BE STRUCTURED. WE NEED MORE STRUCTURE AND VALIDATION. WE NEED TO USE VERSIONING TOOLS. WHAT I DIDN'T MENTION IS THAT WITH OBO, ALL THE COMMITS OF EVERY FILE GO INTO A CENTRAL OPEN SUBVERSION REPOSITORY, AND YOU CAN LOOK AT DIFFERENCES IN ANY DOCUMENT AND SEE WHO MADE THE SUBMISSION, WHEN, AND WHAT CHANGES WERE MADE. I THINK AS A COMMUNITY WE WANT TO RESTRICT WHO CAN WRITE TO ALL OF OUR DATA FILES, BUT WE WANT EVERYONE TO SEE IT, VET IT AND MAKE SUGGESTIONS.
THE LAST THING I WANT TO MENTION, AND I HAVE SEEN THIS IN A LOT OF INSTRUMENTS THAT COME OUT OF THE NCS: IT LOOKS LIKE THERE ARE COMPUTABLE VERSIONS OF A LOT OF THE DOCUMENTATION, BUT THEN IT'S HAND CODED INTO SOMETHING HUMAN READABLE. AND THEN IN THAT PROCESS PEOPLE MAKE CHANGES AND PUT IT BACK INTO THE COMPUTABLE FORMAT. BOTH OF THOSE THINGS INTRODUCE HUMAN ERROR. I WOULD STRONGLY ADVOCATE THAT WE SHOULD HAVE A REPRODUCIBLE, AUTOMATED PROCESS IN WHICH WE TAKE THE COMPUTABLE REPRESENTATIONS AND CREATE THE HUMAN READABLE FORMATS FROM THEM. I THINK THEN A LOT OF THE ISSUES WE SEE WITH THE CURRENT DOCUMENTATION WILL GO AWAY. EVERYTHING IS MORE SEMANTICALLY CONSISTENT. I BELIEVE THAT'S THE END. I WILL LET WAYNE KUBICK COME UP AND TALK ABOUT HIS EXPERIENCE WITH SOME OF THESE SAME ISSUES. >> WHILE TRANSITIONING, ANY QUESTIONS FOR WARREN? >> I'M VERY HAPPY TO TAKE QUESTIONS IF I DIDN'T SAY IT. >> HELLO. THANKS FOR THE OPPORTUNITY TODAY. I'M GOING TO REPEAT MOST OF WHAT THE OTHER SPEAKERS HAVE TALKED ABOUT TODAY. AND I ALSO HAVE BASIC STUFF YOU MAY KNOW VERY WELL. SO I WOULD LIKE TO TALK ABOUT HOW WE MANAGE THE REQUIREMENTS SENT BY THE PROGRAM OFFICE, THE INSTRUMENTS, AND THEN THE NEW VERSIONS AND HOW YOU HANDLE THAT. I DON'T MEAN TO SUGGEST A MECHANISM FOR THE PROGRAM OFFICE TO DECIDE WHAT DATA TO COLLECT. RATHER I MEAN TO SAY WHAT MECHANISM, AFTER THE DATA IS DECIDED, CAN BE (INAUDIBLE) CENTERS SO WE CAN EASILY MAKE IT COMPUTABLE BY OUR SYSTEMS. FOR THAT PURPOSE I WILL TALK ABOUT DATA ELEMENTS, THE caDSR, THEN A POTENTIAL METADATA REPOSITORY OPTION, OPENMDR, WHICH IS OPEN SOURCE, AND HOW WE CAN USE THOSE TOOLS TO CREATE MAYBE A COMPUTABLE SOLUTION. THIS IS OUR OVERVIEW. WE HAVE (INAUDIBLE) TO COLLECT INSTRUMENT DATA. WE HAVE (INAUDIBLE) TO COLLECT PARTICIPANT AND CONSENT INFORMATION. AND PATIENT TAS ACCOUNTED FOR CONTACT AND EVENTS RELATED SIMILAR TO WHAT (INAUDIBLE) ARE USING.
WHAT WE DO IS (INAUDIBLE) ALL THESE DATA FROM DIFFERENT APPLICATIONS TO A STAGING DATABASE. THEN AT THE STAGING LEVEL WE USE TALEND, AN OPEN SOURCE ETL PLATFORM, TO DO THE TRANSFORMATION, PUT IT IN THE DATA WAREHOUSE AND PRODUCE THE XML FOR VDR SUBMISSION. SO TO THE DETAILS. ONE PERSON PROGRAMS THE WORD DOCUMENTS SENT BY THE PROGRAM OFFICE INTO THE (INAUDIBLE) SURVEY. ANOTHER PERSON TESTS WHETHER WE HAVE IMPLEMENTED THE LOGIC THE WAY IT'S SUPPOSED TO BE. AFTER THAT, PRODUCTION DATA COLLECTION GOES TO THE FIELD AND COLLECTS DATA. THEN ANOTHER PERSON IS DOING PRETTY MUCH THE SAME JOB, PULLING ALL THE ELEMENTS IN THE SURVEY OR INSTRUMENT AND TRYING TO MAP THEM INTO THE OTHER REQUIRED COMPONENT USING THE TMAP TRANSFORMATION ENGINE FOR THAT REQUIREMENT. AGREEMENT STAGING DATABASES AND OTHER PERSON, INSTITUTIONS AND OTHER REQUIRED DATA BEING PULLED INTO THE TRANSFORMATION ENGINE AND THE WAREHOUSE. SO THIS IS A ZOOMED-IN VERSION OF OUR TMAP TRANSFORMATION. YOU CAN SEE IN THE GREEN AREAS, IF I'M NOT MISTAKEN, ADDRESS LINE 2 OR SOMETHING LIKE THAT BEING TRANSFORMED OVER HERE, AND IT GOES TO ONE LINE HERE. CONSIDERING WE HAVE LIKE 3200, IT'S A CHALLENGE, AS YOU CAN IMAGINE. IF THERE IS A CHANGE IN THE VERSION OF INSTRUMENT COUNTS WE NEED TO REDO ALL THIS STUFF OVER HERE, AND AFTER 90 DAYS, IF WE HAVE A NEW (INAUDIBLE) VERSION, WE HAVE SOMEBODY WHO NEEDS TO WORK ON THIS SIDE OF THINGS. IF YOU LOOK AT THE ADDRESS BAR, WE HAVE AN EXPRESSION WRITTEN TO PROPERLY FORMAT OR PUT THE REQUIRED DATA INTO THE XML SUBMISSION. SO AFTER THAT YOU CAN SEE FOR PREGNANCY WE HAVE FOUR DIFFERENT JOBS SIMILAR TO WHAT YOU HAVE SEEN EARLIER, AND WE HAVE CREATED A (INAUDIBLE) THAT RUNS ALL THOSE FOUR TO CREATE THE FINAL STATE AND THREE MONTH PREGNANCY VISIT. VERY TIME CONSUMING. SO THURSDAY IS A NIGHTMARE FOR OUR ADMINISTRATORS BECAUSE THEY HAVE TO CREATE THE XML FILES, BUT WE HAVE A WORKING MODEL RIGHT NOW AND ARE MUCH MORE COMFORTABLE SUBMITTING DATA AT THE MOMENT.
OUR BIGGEST PROBLEM RIGHT NOW IS THE VERSIONING, MATCHING THE INSTRUMENT VERSION WITH THE (INAUDIBLE) VERSION, BECAUSE BOTH SIDES HAVE TO TALK TO EACH OTHER SO WE WON'T MAKE MISTAKES. THIS IS HOW WE'RE DOING IT CURRENTLY. SO I'LL GO INTO AN APPROACH THAT MIGHT HELP US WORK IN THE FIELD. THAT IS COMMON DATA ELEMENTS. I THINK WE HAVE HEARD A LOT ABOUT COMMON DATA ELEMENTS, AND THE BENEFITS ARE MINIMIZING THE TIME REQUIREMENT AND DEFINITELY THE COST OF COLLECTION, STANDARDIZING DATA COLLECTION, AND IMPROVING SEMANTIC AWARENESS, THAT'S THE COMPONENT HERE. AND A LOT OF INSTITUTES ARE USING COMMON DATA ELEMENTS CURRENTLY. WHAT IS A DATA ELEMENT? IT CONSISTS OF (INAUDIBLE) WITH THE PROPERTY, THE DATA ELEMENT CONCEPT, WHICH CAN BE ADVERSE EVENT GRADING. THEN YOU DEFINE A PHYSICAL REPRESENTATION, AND THAT CONSTITUTES THE DATA ELEMENT. IT HAS TO MATCH ONE VALUE DOMAIN, BUT YOU MIGHT HAVE MANY VALUE DOMAINS. FOR EXAMPLE, A DATA ELEMENT FOR (INDISCERNIBLE) AND ANOTHER VALUE DOMAIN, THAT MIGHT BE ANOTHER DATA ELEMENT. SO THIS IS A SIMPLE EXAMPLE. WE HAVE A PERSON AS AN OBJECT CLASS, (INAUDIBLE) AS THE OBJECT PROPERTY, AND THAT CONSTITUTES THE DATA ELEMENT CONCEPT, AND WE HAVE A VALUE DOMAIN THAT IS NUMERIC AND NON-ENUMERATED. THIS IS ONE SAMPLE DATA ELEMENT. caDSR IS THE CANCER DATA STANDARDS REGISTRY AND REPOSITORY, I BELIEVE. THE DATABASE HAS BEEN PUT UP BY THE NCI WITH A SET OF APIs AND TOOLS TO CREATE, EDIT, CONTROL, DEPLOY AND FIND COMMON DATA ELEMENTS FOR USE BY CONSUMERS. SO YOU CAN SEE THE CDE BROWSER SCREEN, SHOWN EARLIER. WE CREATE OUR OWN COMMON DATA ELEMENTS FOR CANCER AND OTHER CLINICAL RESEARCH AND USE THE BROWSER. ONE COMMON EXAMPLE IS RACE. IT DEFINES WHAT IT MEANS, (INAUDIBLE) APPROACH AND THE LONG NAME AND NUMBER, WHICH SHOWS THE DATA ELEMENT ID. THE (INAUDIBLE) AS THE IMPORTANT PART IS WHAT IS THE MINIMUM LENGTH AND OTHER SORTS OF METADATA ABOUT YOUR DATA ELEMENT CONCEPT. HERE THE PERMISSIBLE VALUES ARE DEFINED FOR THAT PURPOSE. THERE ARE SEVEN PERMISSIBLE VALUES.
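THE STRUCTURE JUST WALKED THROUGH, AN OBJECT CLASS AND A PROPERTY FORMING A DATA ELEMENT CONCEPT, WHICH IS THEN PAIRED WITH A VALUE DOMAIN TO MAKE A DATA ELEMENT, CAN BE SKETCHED AS BELOW. THIS IS A SIMPLIFIED ILLUSTRATION IN THE SPIRIT OF ISO 11179, NOT THE caDSR SCHEMA OR ITS API: THE CLASS NAMES, THE `accepts` CHECK, THE IDS `EX-1` AND `EX-2`, AND THE SAMPLE "PERSON AGE" AND "YES/NO" ELEMENTS ARE ALL ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataElementConcept:
    """What is being described: an object class plus one of its properties."""
    object_class: str
    prop: str


@dataclass(frozen=True)
class ValueDomain:
    """How a value is physically represented."""
    name: str
    datatype: type
    permissible_values: Optional[Tuple[object, ...]] = None  # None = non-enumerated
    max_length: Optional[int] = None

    def accepts(self, value) -> bool:
        """Check a collected value against this value domain."""
        if not isinstance(value, self.datatype):
            return False
        if self.permissible_values is not None:
            return value in self.permissible_values
        if self.max_length is not None and len(str(value)) > self.max_length:
            return False
        return True


@dataclass(frozen=True)
class DataElement:
    """One data element concept paired with exactly one value domain."""
    public_id: str
    concept: DataElementConcept
    value_domain: ValueDomain


# Non-enumerated numeric value domain, as in the person example in the talk.
person_age = DataElement(
    "EX-1",
    DataElementConcept("Person", "Age"),
    ValueDomain("Age in years", int, max_length=3),
)

# Enumerated value domain: the same idea with a fixed set of permissible values.
consent_given = DataElement(
    "EX-2",
    DataElementConcept("Person", "Consent Given"),
    ValueDomain("Yes/No", str, permissible_values=("YES", "NO")),
)

print(person_age.value_domain.accepts(42))         # prints True
print(consent_given.value_domain.accepts("MAYBE"))  # prints False
```

PAIRING THE SAME CONCEPT WITH A DIFFERENT VALUE DOMAIN WOULD, IN THIS SKETCH, YIELD A DIFFERENT DATA ELEMENT, WHICH MATCHES THE POINT MADE ABOVE ABOUT ONE CONCEPT HAVING MANY VALUE DOMAINS.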
SO THAT'S A HIGHLY CONTROLLED DATABASE, AND ONE THING I FORGOT TO MENTION: THOSE DATA ELEMENT CONCEPTS SHOULD BE FROM THE NCI (INAUDIBLE); WE CANNOT CREATE OUR OWN IF WE WANT TO MAKE IT COMMON. ONE THING THAT WOULD BE HELPFUL IS A REPOSITORY SPECIFIC TO NCS. FOR THAT PURPOSE OPEN MDR IS ONE OPTION THAT CAN BE CONSIDERED. THAT IS (INAUDIBLE) PROVIDES METADATA MANAGEMENT CAPABILITIES WITH SEVERAL COMPONENTS, (INAUDIBLE) AND DOMAIN MODEL GENERATORS. SO IT IS AN ISO 11179 COMPLIANT DATABASE, AND YOU CAN USE A GRID SERVICE TO DISCOVER EXISTING ONES OR ANNOTATE NEW ONES AND MAKE THEM AVAILABLE TO EVERYBODY. MDR CORE IS THE MAIN COMPONENT; AS I SAID, IT IS ISO 11179 COMPLIANT SEMANTIC METADATA. (INDISCERNIBLE) AND IT HAS MODEL DEVELOPMENT FEATURES THAT I WILL BRIEFLY SHOW. SO THIS IS A SAMPLE OF OPEN MDR. WE CAN CREATE ONE CONTEXT FOR THE NCS, FOR INSTANCE, AND START CREATING OUR DATA ELEMENT CONCEPTS, SUCH AS PATIENT GENDER; WHEN WE GO TO NCI, IT DEFINES THESE CONCEPTS AND SPECIFICATIONS. THIS IS THE VALUE DOMAIN DEFINITION: MAXIMUM AND MINIMUM CHARACTERS, WHATEVER YOU WANT TO DEFINE. WE ARE CREATING A NEW DATA ELEMENT. AS YOU MAY REMEMBER, A NEW DATA ELEMENT CONSISTS OF BRINGING TOGETHER A VALUE DOMAIN AND A DATA ELEMENT CONCEPT, SO WE HAVE THE DATA ELEMENT CONCEPT DEFINED HERE AND OUR DATA ELEMENT IS CREATED. HOW CAN WE USE THIS IN THE NCS CONTEXT? WE CAN HAVE A REPOSITORY, OPEN MDR, IN PLACE; (INAUDIBLE) MIGHT BE USED AS WELL. SO (INDISCERNIBLE) METADATA REPOSITORY. WE CAN USE UML MODELING, USING ENTERPRISE ARCHITECT OR SIMILAR TOOLS, TO CREATE DOMAIN MODELS. AND THEN STUDY CENTERS SUCH AS OURS CAN CONSUME THOSE ONLINE, OR YOU CAN EXPORT THOSE FILES AND USE THE FILES AS WELL, PROBABLY IN THE BEGINNING (INAUDIBLE). THIS IS FROM ENTERPRISE ARCHITECT: YOU CREATE THE LOGICAL MODEL AND DEFINE WHAT YOU WANT. THEN YOU CAN GO BACK TO THE DATA MODEL REPRESENTATION OF YOUR UML. USING THE OPEN MDR PLUG-IN YOU CAN PLUG YOUR DATA MODEL INTO THE DATA ELEMENT REPRESENTATION. 
FOR INSTANCE, FOR PARTICIPANT, WHEN YOU SEARCH THE OPEN MDR INTERFACE YOU GET DATA ELEMENT OPTIONS. WHEN YOU HIT ANNOTATE WITH CDE, IT ANNOTATES THAT DATA ELEMENT IN THE DATA MODEL REPRESENTATION. THEN AFTER YOU ANNOTATE ALL THOSE DATA ELEMENTS YOU CAN EXPORT THE XMI, THE METADATA INTERCHANGE -- (INDISCERNIBLE) STUDY CENTERS, OR YOU CAN HAVE AN ONLINE SERVICE FOR THAT PURPOSE. WE CAN CREATE OUR OWN PORT, OR OTHERS CAN CONSUME THAT XMI TO CREATE THEIR INSTRUMENT INSTEAD OF A WORD DOCUMENT. SO IN SUMMARY, I PRETTY MUCH REPEATED THIS. WE CAN HAVE XML FILES IMPORTED INTO OUR INSTRUMENTS. THERE ARE SOME TOOLS LIKE ENTERPRISE ARCHITECT WHERE, WHEN YOU MAKE A CHANGE IN YOUR DOMAIN MODEL, IT CREATES THE (INAUDIBLE) TO UPDATE YOUR VDR. YOU CREATE YOUR DOMAIN MODEL, THERE ARE STUDY CENTERS, AND THEN WHEN YOU MAKE A CHANGE ON YOUR UML MODEL, THIS TOOL PROVIDES FEATURES FOR YOU TO UPDATE THAT DATABASE STRUCTURE. IT'S GOING TO BE MUCH EASIER FOR US TO MAP THOSE INSTRUMENTS, NOT GOING TO BE -- BECAUSE WE DON'T HAVE SKIP LOGIC HERE, BUT EASIER FOR US TO MAP DATA ELEMENTS. THEN WE CAN USE THEIR STRUCTURES AND SUBMIT OUR DATA TO THE VDR. THAT'S IT. I'LL BE HAPPY TO ANSWER ANY QUESTIONS. >> I WANT TO GET THIS IN BECAUSE I HAVE TO GET AN EARLIER FLIGHT OUT. YOU HAD ASKED ME ABOUT WHAT DDI DEALS WITH, HOW TO SEPARATE THE CONCEPTUAL AND THE APPLIED USE OF THINGS, AND ALSO SKIP LOGIC AND THINGS LIKE THAT. BASICALLY THE APPROACH IN THE DDI LIFE CYCLE HAS BEEN TO SEPARATE AS MUCH AS POSSIBLE THE CONCEPTUAL COMPONENTS: CATEGORIES, CODE SCHEMES ASSOCIATED WITH CATEGORIES, ALL THOSE AS REUSABLE OBJECTS AND REUSABLE PARTS, AND YOU BUILD FROM THEM. SO YOU BUILD QUESTIONS USING THE CONCEPTS THAT YOU'RE DEVELOPING A QUESTION ABOUT. THE QUESTION MAY HAVE DYNAMIC TEXT IN IT SO THAT YOU CAN DEAL WITH THE ISSUES OF CHANGING VERBIAGE FOR AGE, GENDER, WHATEVER. AND FROM THAT YOU BUILD CONTROL, WHICH IS BASICALLY YOUR SKIP LOGIC AND PATTERNS. 
THAT IS THEN FED INTO AN INSTRUMENT OR MULTIPLE INSTRUMENTS TO BE EXPRESSED IN DIFFERENT WAYS, AND MAYBE DIFFERENT STEPS TO BE HANDLED IN DIFFERENT WAYS BECAUSE OF THE NATURE OF THE INSTRUMENT. WHAT WE HAVE TRIED TO DO IN THAT DEVELOPMENT PROCESS IS SEPARATE THINGS OFF, AND THEN AS YOU GET DOWN TO THOSE DETAILED APPLIED USES OF THINGS, VARIABLES, QUESTIONS WITHIN A QUESTIONNAIRE, WHAT YOU'RE DEALING WITH IS A STACK OF POINTERS AND THE APPLIED INFORMATION. SO WE TRY TO KEEP A DIFFERENTIATION BETWEEN THOSE TWO. PART OF WHAT I'M SEEING IN THE DESCRIPTION YOU HAVE GIVEN IS THAT THERE'S A DISADVANTAGE IN NOT BEING ABLE TO GET IN AT THE BEGINNING, OF TRYING TO TAKE THE INFORMATION THAT YOU HAVE AND DISASSEMBLE IT AND TRY TO FIGURE IT OUT. AND THEN THERE'S THE PROCESS, A TWO DECADE PROCESS, OF GETTING THE STRUCTURES UP THERE SO THAT EVENTUALLY THOSE INCOMING PIECES ARE BUILT OFF OF THOSE REPOSITORIES OF INFORMATION, SO THAT YOU THEN HAVE THE INFORMATION ABOUT THE RELATIONSHIPS: SOMEONE DOING A ONE-OFF CONCEPT, WRITING A SECONDARY CONCEPT AND SAYING IT'S SIMILAR TO THIS BUT THIS IS MY SLIGHT DIFFERENTIATION; WITHIN THIS VARIABLE, SOMETHING HAS HAPPENED HERE THAT YOU SHOULD KNOW ABOUT. THERE'S A TWEAK AND AN ASSUMPTION THAT WAS MADE. SO THAT IS WHERE DDI WAS COMING FROM AS WE DEVELOPED THE LIFE CYCLE MODEL, AND THERE ARE SOME THINGS THERE THAT YOU MAY WANT TO LOOK AT IN YOUR OWN MODELING. >> VERY HELPFUL. >> AS A QUESTION EXPANDING ON THAT, ONE OF THE QUESTIONS THAT CAME UP AROUND SKIP PATTERNS WAS THAT SKIP PATTERNS ARE EMBEDDED IN THE WORD DOCUMENT THAT DESCRIBES THE INSTRUMENT. THE INSTRUMENT ITSELF MAY NOT HAVE BEEN WRITTEN IN ANY MACHINE READABLE WAY. THE SUPPOSITION IS THAT THIS WILL CONTINUE. WHAT'S THE SOLUTION TO GET THOSE INSTRUMENTS INTO A USABLE FORM SO WE CAN TAKE THE DIRECTION THAT WARREN IS TALKING ABOUT IN THE FUTURE? >> IT DEPENDS ON HOW MUCH INDIVIDUALITY YOU HAVE. 
IF EVERYBODY IS DOING IT IN THEIR OWN PRIVATE WAY, THEN AT MINIMUM YOU HOPE THEY'RE CONSISTENT IN THE OBJECTS THEY PRODUCE SO YOU CAN CREATE A PROGRAMMATIC MEANS OF DOING THAT. I KNOW IN NATIONAL STATISTICAL OFFICES THAT HAVE BEEN DOING IT, YES, THAT IS EXACTLY HOW THEY HAVE BEEN DOING EXISTING SURVEYS. YOU CAN ALMOST TELL WHEN PERSONNEL CHANGED. GO THROUGH AND FIND THE CHANGES IN STYLES. BUT BASICALLY THEY ARE GOING THROUGH DOCUMENTS AND RIPPING THEM OUT AND USING A TOOL TO THEN REGENERATE THE SURVEY TO MAKE SURE WHAT THEY WERE PUTTING OUT REPRESENTED WHAT THEY THOUGHT THEY WERE PUTTING IN. SO IT'S DOABLE, BUT I MEAN, YOU HIT ON THE PROBLEM THAT, FROM AN ARCHIVIST VIEW, PEOPLE HAVE BEEN DEALING WITH FOR DECADES, WHICH IS THAT EVERYBODY DID IT THE BEST WAY: THEIRS. AND THEY CHANGED THEIR MINDS A LOT OVER TIME. SO YEAH, BUT WORD DOCUMENTS ACTUALLY HAVE AT LEAST A LOT MORE UNDERLYING STRUCTURE THAN OTHER THINGS. IT REALLY DEPENDS HOW CONSISTENT THEY WERE, BUT I THINK THE SOONER IT GETS INTO A SYSTEM WHERE A PERSON CAN EASILY DEVELOP A QUESTIONNAIRE AND A FLOW STRUCTURE USING PRE-MADE OR PRE-SET OR DEVELOPED QUESTIONS AND RESPONSE DOMAINS, OR REUSING THOSE RESPONSE DOMAINS, SO IF YOU HAVE A LIKERT SCALE OR SOME OTHER RESPONSE SCALE YOU'RE EXPRESSING IT ONCE AND REUSING IT MULTIPLE TIMES. IT'S ONE OF THOSE THINGS WHERE A LARGE PROCESS MODEL HAS AN UP-FRONT PAYOFF. IF THEIR LIFE IS EASIER IT MAKES YOUR LIFE EASIER. AND WHAT WE HAVE ACTUALLY FOUND IN PEOPLE DEVELOPING TOOLS FOR QUESTIONNAIRE DEVELOPMENT WITHIN DDI IS THAT ONCE THEY GOT THROUGH THAT PART WHERE THEY HAD WRITTEN THE QUESTIONS, DEVELOPED THE QUESTIONNAIRE AND PUT IN PROCESSING CODE, IF YOU WROTE THE PROGRAM FOR IT YOU CAN PUSH A BUTTON AND GENERATE 99% OF THE REMAINING DOCUMENTATION THAT WENT WITH IT. WHICH IS A REAL PAYOFF. >> I HAVE A QUESTION FOR ALL THREE OF THE LAST SPEAKERS. WE HAVE BEEN TALKING ABOUT METADATA BUT ALSO OPERATIONAL DATA ELEMENTS. 
AND I WOULD LIKE YOU TO COMMENT ON THE FLAVOR OF METADATA AS RELATED TO OPERATIONAL DATA ELEMENTS, WHICH ONCE UPON A TIME WERE PARADATA: WHETHER THE METADATA AROUND PARADATA OR AROUND OPERATIONAL DATA ELEMENTS HAVE ANY DIFFERENCES, INTRINSIC, LOGICAL OR OTHERWISE, FROM THE METADATA AROUND WHAT WE'LL CALL CONTENT DATA. SINCE YOU HAVE THE PODIUM, WHY DON'T YOU BEGIN. >> AS MUCH AS I CAN. SO, AS YOU MENTIONED EARLIER, OPERATIONAL DATA ELEMENTS ARE THE MAIN DATA TYPES WE ARE LOOKING FOR. I THINK WE CAN USE THE SAME COMMON DATA ELEMENT STRUCTURE WE WERE TALKING ABOUT: OBJECT CLASS (INDISCERNIBLE) DATA CHECKED, TRAVEL TO THE PARTICIPANT'S HOUSE, OR DEFINE YOUR DATA ELEMENT AND WHAT KIND OF PHYSICAL REPRESENTATION YOU WANT TO CAPTURE. SO TO ME WE CAN EASILY DEFINE THOSE ELEMENTS, WHETHER IT'S CONTENT LIKE INSTRUMENTS OR OPERATIONAL DATA ELEMENTS. IF I'M ANSWERING YOUR QUESTION. >> I'M HEARING MORE THAT THE PROCESS IS INTRINSICALLY THE SAME, AND YOU'RE JUST TARGETING IT SOMEWHAT DIFFERENTLY. >> I THINK ONE THING THAT WILL BE DIFFERENT IS THAT AS YOU DEFINE THOSE CLASSES, THEY HAVE TO BE FROM THE NCI THESAURUS. ONE MIGHT ASSUME THE CONTENT ONES ALREADY EXIST, AND THE ONES WE MIGHT NEED TO CREATE AND DEFINE MIGHT BE THE NEXT STEP. IT MIGHT BE A LITTLE MORE WORK BUT IT IS POSSIBLE THE SAME WAY. >> INTRIGUING. >> I'M GOING TO MAKE A COMMENT ABOUT A DIFFERENT KIND OF OPERATIONAL DATA: WHEN IT WAS STARTED, HOW LONG DID IT TAKE TO TAKE THE INSTRUMENT, HOW IS IT EVEN PRESENTED. I THINK THOSE METADATA ARE MUCH MORE SHAREABLE AND MORE STANDARDIZABLE THAN MORE DOMAIN TYPES OF ELEMENTS WHERE THEY'RE SPECIFIC. SO FOR INSTANCE, EVERY SINGLE INSTRUMENT HAS A START AND END TIME, AND AS LONG AS YOU CAPTURE THE CONTEXT, WHICH INSTRUMENT IT IS YOU'RE GIVING, (INAUDIBLE) IS DISCLOSED. 
I THINK THERE ARE SOME OF THE OPERATIONAL DATA ELEMENTS THAT WOULD BE EASY TO REPRESENT AND DO CONSISTENTLY ACROSS ALL INSTRUMENTS FOR ALL THE VICTIMS, SO IT SHOULDN'T MATTER; THERE SHOULD BE OPERATIONAL DATA IN THE IMS AS WELL. HOW MUCH TIME DO YOU SPEND IN A PARTICULAR SCREEN GETTING TO WHERE YOU WANT TO PUT THAT PARTICULAR INSTRUMENT. THOSE ARE ALL, AGAIN, I THINK FAIRLY INHERENTLY REUSABLE, TO USE WENDY'S EXCELLENT WORD. AND THOSE TYPES OF ELEMENTS ARE MUCH EASIER TO DEAL WITH. I HOPE MUCH EASIER TO DEAL WITH. SOME OF THE OTHER OPERATIONAL DATA, WHAT CERTIFICATIONS THE INDIVIDUALS INVOLVED HAVE, THAT'S VERY SPECIFIC TO WHAT THE NCS NEEDS. >> OKAY. DR. KUBICK, DO YOU HAVE ANY THOUGHTS ON WHETHER THERE ARE DISTINCTIONS? (OFF MIC) >> CORRECT. (OFF MIC) >> ALL RIGHT. I THINK ON OUR PROGRAM WE HAVE THREE MORE SPEAKERS. WE'D LIKE TO END AT 5 BUT I KNOW IT'S SOMETIMES A LITTLE TOUGH TO SIT STILL, ESPECIALLY LATER IN THE AFTERNOON, SO WHY DON'T WE TAKE ABOUT FIVE MINUTES JUST TO STRETCH AND THEN COME BACK AND CATCH THE OTHER THREE SPEAKERS AND THEN TRY TO WRAP UP AND SEE WHAT WE'RE GOING TO PULL OUT OF THIS. >> HEY, GUYS, WE'D LIKE TO RECONVENE. >> I HAVE A BRIEF ADMINISTRATIVE ANNOUNCEMENT. ANYONE WHO WANTS A TAXI SHOULD GO TO THE REGISTRATION DESK NOW AND GIVE THEM YOUR NAME SO THAT YOU WILL BE ABLE TO GET YOUR TAXI. THE NEXT GROUP OF FOLKS ARE GOING TO TALK A LITTLE BIT ABOUT DATA PRESERVATION AND METADATA AS A WHOLE. AND WE HAVE TWO SPEAKERS, ACTUALLY THREE. THE FIRST ONE IS DR. BILL BLOCK, DIRECTOR OF THE CORNELL -- WE'RE GOING TO START WITH PASCAL, THE EXECUTIVE MANAGER FOR THE OPEN DATA FOUNDATION. HE KNOWS A LOT ABOUT TOOLS RELATED TO DDI AND OTHER STANDARDS. PASCAL, YOU WANT TO START? >> SO (INDISCERNIBLE) OPEN DATA AND METADATA TECHNOLOGY. I HAVE BEEN INVOLVED FOR MANY YEARS IN MICRODATA IN MANY PLACES AROUND THE WORLD, AND I HAVE ALSO BEEN INVOLVED WITH DDI FOR 10, 12 YEARS. 
THE NATIONAL STATISTICAL AGENCIES, SO WHAT I TALK TO YOU ABOUT TODAY HAS TO DO WITH SOME HIGH LEVEL THINGS ABOUT METADATA AND WHAT WE DO WITH IT. THEN I HAVE A SECOND PART WHICH IS ON DDI, WHICH HAS BEEN A TOPIC FOR THE NCS. AND WHEN DONE WITH THAT WE'LL GIVE YOU A FEW DDI (INAUDIBLE), AND BILL BLOCK FROM CORNELL WILL TALK ABOUT A PROJECT RECENTLY INITIATED WITH CENSUS DATA. WE ALSO HAVE (INDISCERNIBLE) FROM NRC, WHO IS ONE OF THE SPEAKERS, IF YOU HAVE QUESTIONS FOR (INDISCERNIBLE) ABOUT REMOTE ACCESS TO SENSITIVE DATA, THE PROJECT WE'RE COLLABORATING ON WITH NRC, AND THAT MAY BE OF INTEREST FOR THE NCS. ONE OTHER SPEAKER THAT COULDN'T MAKE IT IS JEREMY IVERSON. HE COULDN'T MAKE IT TODAY. I DO A LOT OF THINGS WITH METADATA. THE IDEA IS WHAT CAN METADATA DO FOR YOU, AND THE SECOND PART IS WHAT CAN DDI DO FOR YOU. SO HIGH LEVEL, YOU KNOW ABOUT THIS. THE DEFINITION OF DATA, IT'S VERY WEB 2.0. THAT'S MISSING DATA BUT IT CAN BE MANY THINGS. WE CAN HAVE MANY DEFINITIONS OF METADATA. I NEED METADATA TO MAKE SOLID DECISIONS, TO SEARCH FOR THINGS, TO NAVIGATE BY VARIANTS, TO SHARE KNOWLEDGE. THERE ARE MANY USES OF METADATA. IT WILL MAKE YOUR LIFE EASIER. WITHOUT IT, IT WOULD BE PRETTY TOUGH. THIS IS AN EXAMPLE: I GO TO THE STORE, THE SHOP, AND IT'S TOUGH TO BUY. I WANT MORE INFORMATION, OBVIOUSLY, TO KNOW WHICH KIND OF FOOD I'M GOING TO BUY. MORE ADDITIONAL INFORMATION: IF YOU'RE DIET CONSCIOUS, IT'S A LOT OF SODA IN THAT CAN. WITHOUT THAT YOU CANNOT MAKE VERY GOOD DECISIONS. THERE'S EVERYDAY METADATA AND TWO FLAVORS. IF YOU HAVE READ ABOUT METADATA AND THEIR STANDARDS, THAT'S WHAT PEOPLE THINK ABOUT. IT'S ABOUT INFORMING WHAT I'M GOING TO READ ABOUT SOMETHING ELSE. YOU USE IT EVERY DAY: IT'S OBVIOUSLY NUTRITION INFORMATION, YOU GO TO THE AIRPORT, YOU RIDE THE METRO, OR YOU HAVE A RESTAURANT MENU, AND YOU REMEMBER (INDISCERNIBLE) WAS 15 CENTS IN 1955 AT MCDONALDS. THIS IS METADATA TO DECIDE WHICH FOOD YOU'RE GOING TO TAKE. IT'S A DDI TERM USED WIDELY. 
THIS IS REALLY FOR ME, AND OBVIOUSLY WE DO NEED DOCUMENTATION. WE NEED TO DELIVER THAT. THIS IS WHERE THE POWER OF METADATA COMES FROM. YOU SLIDE YOUR CARDS, METADATA CAN BE EXCHANGED. THIS IS WHAT A LOT OF PEOPLE FORGET, WITH MACHINE ACTIONABLE METADATA AND SMART METADATA. WE USE METADATA EVERY DAY. AN IMPORTANT (INAUDIBLE), ESPECIALLY FOR TECHNOLOGIES, WAS THE RISE OF THE INTERNET IN THE LATE '90s AND 2000 AND LATER. WHEN YOU GO TO YAHOO, GOOGLE, MICROSOFT NETWORK, YOUR FAVORITE CONTENT PROVIDER, YOU GET LOTS OF METADATA ABOUT LOTS OF THINGS: NEWS, PRESS RELEASES. YAHOO OR GOOGLE, THESE ARE NOT -- THEY ARE IN THE -- HOW DO THEY PULL IT OFF? HOW DO THEY GET INFORMATION TO YOU? IT'S REALLY WHERE YOU START TO THINK ABOUT THINGS LIKE XML TECHNOLOGIES AND STANDARDS. OBVIOUSLY THEY SOUGHT TO EXCHANGE INFORMATION WITH DIFFERENT DOMAINS. THEY NEED A LANGUAGE. AND A BIG PIECE OF THE SUCCESS OF THE INTERNET IS XML TECHNOLOGIES AND RELATED STANDARDS. WE USE IT FOR MANY THINGS, NOT JUST DOCUMENTATION BUT CONTEXT: FOR SURVEY DATA WE NEED TO KNOW WHERE IT COMES FROM, WHERE IT WAS PRODUCED, THE YEAR, THE TITLE, SAMPLING METHODOLOGIES. TO DISCOVER RESEARCH: DISCOVERY IS VERY IMPORTANT, WE SEARCH METADATA TO DISCOVER THE DATA BEHIND IT. TO PROMOTE AND ADVOCATE: WE PUBLISH METADATA TO SAY I HAVE SOMETHING HERE YOU MAY BE INTERESTED IN. CATALOGS, INFORMATION ABOUT CAUSE, ALL THOSE OTHER THINGS. DOCUMENTATION AS WELL, A BIG STORY, AND AUTOMATION. I CANNOT EMPHASIZE THIS ONE MORE: THERE ARE LOTS OF THINGS PEOPLE DO BY HAND, MANUALLY, EVERY DAY THAT CAN BE MORE EFFICIENT AND EASIER USING STRICT METADATA. WHEN WE EXCHANGE INFORMATION WITH EACH OTHER, WE NEED A COMMON LANGUAGE AND STANDARDS. YOU MAY WANT TO DO WHATEVER YOU LIKE, BUT AS SOON AS YOU TALK TO SOMEBODY ELSE YOU WANT A LEVEL OF COMMUNICATION AND THE SAME LANGUAGE, THE SAME STANDARD, WITHIN AN AGENCY AS WELL WHEN YOU COMMUNICATE WITH EACH OTHER. SECURE AND PROTECT: THAT'S AN INTERESTING ONE, ESPECIALLY IN THE DATA WORLD. 
SENSITIVE DATA, METADATA IS A POINT (INAUDIBLE) DECISION, ACCESS POLICIES TOO; IT IS METADATA THAT ESSENTIALLY CONTROLS ACCESS TO DATA. IT'S ALSO IMPORTANT TO REMEMBER THAT IF SOMETHING IS LOCKED DOWN BEHIND CLOSED WALLS, LIKE SENSITIVE DATA, YOU WANT THE METADATA OUT THERE BECAUSE IT WILL HELP PEOPLE DECIDE (INAUDIBLE): DO I HAVE ENOUGH INFORMATION TO SAY I CAN USE THE PUBLIC USE FILE OR GO TO A MORE SENSITIVE VERSION. RESEARCH (INAUDIBLE) TO A SOLID DECISION ABOUT THAT. SO SHARE KNOWLEDGE AND SUCH. METADATA -- WE DON'T WANT TO LIVE IN A WORLD LIKE THIS. DO I BUY A CAN, IS THIS FOR MY CAR OR (INAUDIBLE)? NUMBERS WITHOUT METADATA. THAT'S NOT REALLY GOOD. I WANT PINTO BEANS AND ACTUALLY I WANT TO COOK WITH CANOLA OIL. WE HAVE A LOT OF INFORMATION SO PEOPLE CAN RESPONSIBLY USE THE DATA, NOT ONLY TO GIVE ACCESS TO IT BUT TO USE IT IN THE RIGHT WAY. WHEN YOU JUST GIVE DATA WITH NOT ENOUGH METADATA, PEOPLE WILL GUESS, AND GUESSING IS DANGEROUS. PEOPLE GUESS DIFFERENTLY. WHAT ABOUT STATISTICAL DATA? THIS IS NOT ALL TRUE. MY BACKGROUND IS WITH SOCIAL DATA AND I REALIZE THAT ACTUALLY THE HEALTH DOMAIN IS IN BETTER SHAPE THAN OTHER PLACES I HAVE SEEN. BUT DO WE LIVE IN A METADATA WORLD WHERE THE (INAUDIBLE)? YES AND NO. WHEN YOU LOOK AT THE STATISTICAL DATA THAT'S OUT THERE, YOU GET MICRODATA, SOFTWARE, BUT THAT'S ABOUT IT. YOU GET AGGREGATED DATA, IT'S PUBLISHED IN HTML, EXCEL, BUT TELLS ME A LOT OF DIRECTIONALITY CONCEPTS OF THE TABLE. DOCUMENTATION IS PDF AND WORD, USEFUL AS SOFTWARE BUT NOT ON SITE OR SEARCH (INAUDIBLE). SO IT IS DISSEMINATED BUT WITHOUT A LOT OF METADATA. THAT'S SOMETHING EVERYBODY WANTS TO IMPROVE ON. WHAT CAN WE DO? OBVIOUSLY EVERY DOMAIN HAS A METADATA PROBLEM. IT'S A UNIVERSAL PROBLEM. WHEN THE INTERNET CAME UP, BUSINESS TO BUSINESS, ALL THOSE CAME UP WITH NEW TECHNOLOGIES, SO THEY HAD TO TALK TO EACH OTHER. BEFORE, THEY WERE POINT-TO-POINT INTEGRATIONS, WORKING FINE. WHEN THE INTERNET CAME, THAT HAD TO CHANGE. 
THEY HAD TO (INAUDIBLE) MISDEMEANOR AWAY. SO TECHNOLOGY CAME FROM THAT. THE PROBLEM IS THAT THE STATISTICAL WORLD IS SLOW TO ADOPT THESE THINGS. THEY HAVE BEEN AROUND 10, 15 YEARS BUT ADOPTION IS QUITE SLOW, AND THAT'S REALITY. THAT'S THE WAY IT IS. SO (INDISCERNIBLE) JUST FOR USE, MORE AND BETTER METADATA, BUT IT'S NOT THAT EASY TO DO. WE KNOW WE NEED TO ADOPT STANDARDS. THERE'S NO ONE STANDARD TO RULE THEM ALL. THERE ARE MANY STANDARDS; YOU WANT TO FIND STANDARDS THAT COMPLEMENT AND WORK WELL WITH EACH OTHER, THAT WILL BE HERE FOR THE NEXT 5, 10, 20 YEARS, AND THAT INTEROPERATE. XML TECHNOLOGY, WE KNOW WELL WHAT IT IS ABOUT AND WE KNOW WE CAN LEVERAGE IT. THE TECHNOLOGY IS THERE, AS WELL AS FREE TECHNOLOGY THAT YOU CAN USE. CHANGING PRACTICES AS WELL IS OUR BIGGEST PROBLEM. YOU CAN FIND THE BEST TECHNOLOGY AND THE BEST STANDARD, BUT IF YOU DON'T MANAGE THE CHANGE, IF YOU DON'T INTRODUCE THE STANDARD THE RIGHT WAY IN YOUR ORGANIZATION, IT'S NEVER GOING TO WORK. SO REALLY, WHY DON'T WE DO IT? WE ALL WANT BETTER DATA, BETTER DATA QUALITY. MANY TIMES WE DON'T KNOW ENOUGH ABOUT IT. THERE IS NOT A LOT OF INFORMATION AND KNOWLEDGE ABOUT IT. AND WE NEED TO TALK MORE AND INFORM MORE ABOUT XML AND METADATA STANDARDS. AS WELL, WE DON'T LIKE CHANGE. IT'S HUMAN NATURE: WE HAVE BEEN DOING THE SAME THING THE SAME WAY FOR THE PAST 10, 15 YEARS AND WE DON'T LIKE THE CHANGE. WE HAVE SEEN SECURITY WHICH MEANS YOU NEED CHANGE MANAGEMENT PRACTICES. YOU NEED HIGH LEVEL SUPPORT; THE DECISION NEEDS TO COME FROM THE EXECUTIVE LEVEL. I HAVE SEEN A LOT OF INITIATIVES COMING FROM WELL INTENTIONED PEOPLE. IT WILL FAIL WITHOUT EXECUTIVE SUPPORT. YOU NEED A STRATEGY. THERE ARE THINGS WE NEED TO KEEP GOING; ESPECIALLY IN A STATISTICAL AGENCY YOU HAVE TO PRODUCE YOUR DATA EVERY MONTH, EVERY YEAR OR NINE, SO YOU CANNOT JUST STOP AND SAY OH, (INAUDIBLE) METADATA -- IT DOESN'T WORK THAT WAY. YOU HAVE TO FIND A STRATEGY TO BRING IT IN SLOWLY, AND MAYBE IN 5, 10 YEARS YOU ARE DATA DRIVEN, BUT IN THE MEANTIME YOU HAVE TO CONTINUE TO LOOK AT YOUR MANDATE, WHICH IS PRODUCING DATA. 
IT'S NOT TO CHANGE EVERYTHING OVERNIGHT. TOOLS ARE NOT WELL EQUIPPED. PARTICULARLY, IF YOU LOOK AT STATISTICAL SOFTWARE, BEYOND THE DATA DICTIONARY YOU DON'T HAVE MUCH METADATA, AND THIS SOFTWARE HAS NOT CHANGED FOR 10, 15 YEARS. I REMEMBER LEARNING SAS IN THE '80s; IT IS PRETTY MUCH THE SAME SOFTWARE. ON THE SERVER SIDE, IN BUSINESS INTELLIGENCE, THINGS HAVE CHANGED. BUT IT'S STILL ONE FILE AND THAT IS A PROBLEM. SOME OF THE TOOLS WE USE TODAY, WE CANNOT CHANGE THEM. SO THE (INAUDIBLE) IS TO SAY WE NEED TO (INAUDIBLE) AROUND THEM TO COMPLEMENT WHAT THEY DO, OR PRESSURE THE VENDORS AND THE COMMERCIAL COMBINATION TO SAY WE NEED BETTER STATISTICAL PACKAGES. THAT'S HARDER TO DO. HOW MUCH DOES THIS COST? PRODUCING GOOD METADATA COSTS. NO QUESTION. BUT FIRST, THE RETURN ON INVESTMENT IS QUITE GOOD. SUDDENLY YOU HAVE BETTER DATA, YOU REDUCE (INAUDIBLE), THINGS GO FASTER. I FIND BASICALLY WHEN YOU BUDGET DATA (INAUDIBLE) THE BUDGET FOR METADATA IS EXTREMELY SMALL, SOMETIMES NON-EXISTENT. AND SOMETIMES THIS MAKES SENSE TO ME: IF I'M IN A PUBLISHING COMPANY AND I WANT TO PUBLISH A BOOK, HOW MUCH OF MY BUDGET AM I GOING TO PUT INTO PACKAGING MY BOOK AND PUTTING IT ON THE WEB SO PEOPLE CAN READ IT? IT WILL BE A SIGNIFICANT AMOUNT OF MY BUDGET. WHEN I LOOK AT DATA PRODUCTION AND THE BUDGET THAT GOES TOWARDS METADATA, IT'S OFTEN WAY TOO SMALL. I THINK WHATEVER YOU DO, THINK ABOUT METADATA AND BUDGET IT. WHEN I LOOK AT THE U.S. STATISTICAL SYSTEM, $4 BILLION A YEAR GOES INTO STATISTICS, I THINK (INAUDIBLE). IF ONE PORTION OF THAT GOES TOWARDS METADATA IT MAKES A HUGE DIFFERENCE IN TERMS OF TOOLS AND STANDARDS, AND EVERYBODY CAN USE IT. SO BUDGETING FOR METADATA, I DON'T SEE IT FROM THE DATA PRODUCERS. THEN DOES IT WORK? YES. I THINK THE INTERNET IS A GOOD (INAUDIBLE). THEY USE XML ALL THE TIME AND THAT'S HOW THE INTERNET WORKS. SO WHY CAN'T WE TAKE THE SAME TECHNOLOGY AND APPLY IT TO THE STATISTICAL WORLD? THERE'S NO REASON WHY NOT. A LOT OF PEOPLE ARE RISK AVERSE. WE'RE IN THE EARLY STAGE. 
WE NEED INNOVATORS TO GO OUT AND DEMONSTRATE IT WORKS. THAT IS AN IMPORTANT ASPECT. THERE ARE MORE REASONS. THINGS CHANGED IN THE PAST TEN YEARS. MORE DATA IS PRODUCED AND DEMAND HAS (INAUDIBLE) DRAMATICALLY BECAUSE OF THE INTERNET. GLOBALIZATION AS WELL. WE NEED DATA FROM OTHER COUNTRIES. WE WANT DATA FROM THE WORLD. IF YOU LOOK TODAY, THE POPULATION OUTWEIGHED DEVELOPED COUNTRIES 27 TO ONE. IF WE DON'T GET DATA FROM OTHER PLACES IN THE WORLD, HOW DO WE KNOW WHAT'S GOING ON? WE HAVE TO PLAN FOR THAT TODAY. THE ECONOMY IS THE SAME WAY. WE NEED ECONOMIC DATA FROM ALL AROUND THE WORLD. GLOBALIZATION IS AN IMPORTANT REASON TO EXCHANGE DATA AND TALK TO EACH OTHER. WE NEED TO (INAUDIBLE) A COUPLE OF LANGUAGES. TRANSPARENCY: I'M SURE YOU HAVE BEEN PUT UNDER PRESSURE WITH (INAUDIBLE). METADATA IS IMPORTANT TO THAT. SUPPORT FOR GOING TO THE SEMANTIC WEB. IT REDUCES YOUR BURDEN AND COST. IF YOU PUT MORE INFORMATION OUT THERE, PEOPLE WILL NOT ASK THAT MANY QUESTIONS, WHICH MAKES EVERYBODY'S LIFE EASIER. AND PRESERVING: WE NEED TO STORE INFORMATION IN A PROPER FORMAT. WE DON'T WANT TO STORE DATA AND IN TEN YEARS (INAUDIBLE). ASCII IS NOT ENOUGH. I'LL TALK ABOUT THAT. SO THAT IS HIGH LEVEL METADATA. YOU PROBABLY KNOW THOSE THINGS, BUT FOR SOME ASPECTS IT'S GOOD TO TALK ABOUT AND REFRESH WHY WE WANT TO DO METADATA. THE SECOND POINTS WERE XML AND DDI. YOU ARE FAMILIAR WITH XML. WHAT'S IMPORTANT TO KNOW IS THAT IT'S MORE THAN JUST METADATA. IT'S A SET OF TECHNOLOGIES THAT ALLOWS YOU TO DO A LOT OF THINGS. YOU CAN CAPTURE INFORMATION, STRUCTURE IT AND MODEL IT TO SAY THIS IS THE WAY I WANT MY INFORMATION TO BE CAPTURED AND EXCHANGED. YOU CAN TRANSFORM IT; EXCEPT FOR PEOPLE LIKE ME AND DDI, YOU DON'T WANT TO (INAUDIBLE). IT'S TRANSFORMING THAT INTO SCRIPT, INTO TEXT, NATIVE TO THE XML WORLD. EXCHANGE. ARE WE UNDER ARCHITECTURE. YOU CAN SEARCH, AND SEARCH IS QUITE IMPORTANT, AND THE BEAUTY OF XML IS YOU HAVE LANGUAGES LIKE (INAUDIBLE). I CAN PUT IT IN AN XML DATABASE; I DON'T NEED TABLES AND RELATIONSHIPS AND INDEXES. 
THIS IS POWERFUL, DEVELOPMENT SOURCE, AND MOST LARGE VENDORS LIKE IBM OR MICROSOFT SUPPORT IT AS A NATIVE LANGUAGE. THE DDI HAS BEEN AROUND SOME TIME. IT OFFICIALLY WAS PUBLISHED IN 2000 AS VERSION 1, BUT IT COMES FROM THE DATA ARCHIVE COMMUNITY AND ICPSR, FROM THE 90s, 80s, 70s, FROM THE METADATA ON CARDS, GOING TO AN XML STANDARD. THE DDI ALLIANCE IS A MEMBERSHIP ORGANIZATION. PRETTY MUCH ANYBODY CAN BECOME A MEMBER, AND THERE ARE 35 OF THEM TODAY. IT COVERS WHAT WE KNOW, FILES, VARIABLES, SUMMARY STATISTICS AND VALUE LABELS, AND ALSO THINGS EXTREMELY IMPORTANT TO WHAT WE DO: CONCEPTS, UNIVERSE, QUESTIONS, PROVENANCE, POLICIES, GEOGRAPHY. IT REALLY IS COMPREHENSIVE KNOWLEDGE AROUND THE DATA SET SO YOU CAN DISSEMINATE AND DO USEFUL THINGS WITH IT. THERE ARE TWO FLAVORS OF THE DDI, AS YOU MAY HAVE HEARD. ONE IS CODE BOOK, THE OTHER ONE LIFE CYCLE. CODE BOOK USED TO BE KNOWN AS DDI 2, OR VERSION 1, VERSION 2, AND 2.5 WAS RECENTLY PUBLISHED A WEEK AGO. IT'S THE ONE THAT HAS BEEN AROUND FOR A LONG TIME AND A LOT OF PEOPLE ARE USING IT. IT WORKS GREAT FOR WHAT IT DOES AND IS VERY MATURE. THERE ARE LOTS OF TOOLS AROUND DDI-2 AND IT IS USED IN MANY PLACES AROUND THE WORLD. IT HAS WEAKNESSES. IT CAME OUT OF THE ARCHIVE COMMUNITY, FOR ARCHIVING. IT LOOKS AT A SINGLE SURVEY INSTANCE. IT WRAPS EVERYTHING IN ONE DOCUMENT. IF YOU HAVE A VARIABLE, YOU CAN DOCUMENT IT BECAUSE ALL THE INFORMATION IS SQUEEZED INTO THE VARIABLE. THE ISSUE IS THE REUSABILITY OF THINGS. IF YOU HAVE THE SAME QUESTION FIVE TIMES YOU DOCUMENT IT FIVE TIMES. IF THE SAME CLASSIFICATION, LIKE YES, NO, IS USED BY 50 VARIABLES, THE DATA IS REPEATED 50 TIMES. IT WORKS WELL, SO IT'S NOT SOMETHING THAT NEEDS TO BE IGNORED. BUT OBVIOUSLY FOR A DATA PRODUCER IT WAS NOT THE RIGHT STANDARD. WHEN YOU START A SURVEY YOU DON'T HAVE A VARIABLE. YOU START WITH THE CONCEPTS. WHICH IS WHERE THE DDI LIFE CYCLE CAME FROM WHEN WE STARTED DESIGNING IT MANY YEARS AGO: THIS IDEA TO SUPPORT EVERYTHING FROM DAY ONE OF THE SURVEY TO ARCHIVING TO DISSEMINATION TO REPURPOSING OF THE RESEARCHER OUTPUTS. 
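THE CONTRAST BETWEEN REPEATING A YES/NO CLASSIFICATION IN EVERY ONE OF 50 VARIABLES AND DEFINING IT ONCE AND POINTING TO IT CAN BE SHOWN WITH A SMALL SKETCH. THE PLAIN PYTHON DICTIONARIES BELOW ARE STAND-INS CHOSEN FOR ILLUSTRATION, NOT DDI XML, AND THE FUNCTION NAMES `codebook_style`, `lifecycle_style` AND `categories_for` ARE HYPOTHETICAL.

```python
YES_NO = {"1": "Yes", "2": "No"}


def codebook_style(variable_names):
    """Each variable carries its own private copy of the categories."""
    return {
        name: {"label": name, "categories": dict(YES_NO)}
        for name in variable_names
    }


def lifecycle_style(variable_names):
    """The categories are defined once and variables only reference them."""
    schemes = {"yes_no": dict(YES_NO)}
    variables = {
        name: {"label": name, "category_scheme": "yes_no"}
        for name in variable_names
    }
    return schemes, variables


def categories_for(schemes, variables, name):
    """Resolve a variable's reference back to the shared definition."""
    return schemes[variables[name]["category_scheme"]]


names = [f"q{i}" for i in range(1, 51)]
copied = codebook_style(names)
schemes, referenced = lifecycle_style(names)

# Adding a category: 50 separate edits in the copied form, one in the shared form.
schemes["yes_no"]["9"] = "Refused"
```

IN THE REFERENCED FORM THE SINGLE EDIT ON THE LAST LINE IS VISIBLE FROM ALL 50 VARIABLES, WHILE THE 50 COPIES IN THE OTHER FORM ARE UNCHANGED AND WOULD EACH NEED UPDATING. THE PRICE IS THE EXTRA LOOKUP STEP, WHICH IS A SMALL VERSION OF THE ADDED COMPLEXITY THAT COMES WITH REUSE.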
SO THERE IS A HIGH LEVEL OF REUSE. IF YOU LOOK AT DDI-3 AND YOU'RE FAMILIAR WITH DATABASES, THAT'S WHAT IT IS. IT WENT THE OTHER WAY FROM DDI-2 AND MADE THINGS REUSABLE: DEFINE ONCE AND REUSE. THAT SOMETIMES IS HARD TO DO. IT IS NOT TRIVIAL BUT IT CAN BE POWERFUL IF YOU DO IT THE RIGHT WAY. THERE ARE LOTS OF APPLICATIONS, NOT JUST ARCHIVING. THERE ARE LOTS OF THINGS TO DO WITH DDI 3: YOU CAN DO A BIOBANK, A QUESTION BANK, YOU CAN MANAGE WORK FLOW. ONE OF THE MAIN DIFFERENCES AS WELL IS THAT IT CAN HAVE MANY CONTRIBUTORS. DDI IS ONE AGENCY, ONE PERSON COMPILING INFORMATION. WITH DDI LIFE CYCLE, BECAUSE EVERYTHING IS NICELY TAKEN APART, YOU CAN HAVE AGENCIES THAT MANAGE THINGS INDEPENDENTLY, AND EVERYTHING CAN BE REUSED ACROSS, BUT IT IS MORE COMPLEX. THERE'S NO QUESTION. IT IS NOT A SIMPLE XML SPECIFICATION. IT'S ONE OF THE MORE COMPLICATED ONES. YOU HAVE TO BE FAMILIAR WITH XML TECHNOLOGIES. AND IT REQUIRES MORE IT CAPACITY, AND THERE ARE NOT THAT MANY TOOLS. IT'S VERY NEW; IT HAS BEEN AROUND A FEW YEARS BUT YOU STILL DON'T HAVE A LOT OF TOOLS AROUND DDI-3. SO FROM THE CODE BOOK PERSPECTIVE, EVERYTHING IS ASKS QUESTIONS, THEY'RE RESPONSIBLE FOR DOCUMENTING EVERYTHING. ALWAYS A LOT ABOUT THE SURVEY, SO THERE MIGHT BE KNOWLEDGE THAT CANNOT MAKE IT TO THIS ON THE BOTTOM. THE LIFE -- YOU HAVE SEEN THE LIFE CYCLE AND A SLIDE ABOUT IT. NOW EVERYBODY CAN USE IT AT ANY STAGE OF THE DATA PRODUCTION OR ARCHIVING LATER ON. THIS IS AN EXAMPLE OF HOW DDI SEES THESE THINGS. A QUESTIONNAIRE WITH AN INDICATION MODULE, AND IF YOU REMEMBER WE'RE ASKING FOR (INAUDIBLE) WISH HERE. USING THAT, THE QUESTION IS BROKEN DOWN INTO ONE PIECE, YOU HAVE CLASSIFICATIONS TO BE BROKEN DOWN AND MAINTAINED DIFFERENTLY, VALUE LABELS, FLOW LOGIC, INTERVIEWING INSTRUCTIONS AT DIFFERENT LEVELS: THEY CAN BE AT THE QUESTION LEVEL, THE SECTION LEVEL OR WITHIN A GROUP. AT THE TOP, YOU GET MODULES THAT CAN COME INTO CONCEPTS, AND CONCEPTS GO EVERYWHERE. 
YOU ALSO HAVE UNIVERSES, SO TAKE THIS APART AND -- DDI-3 IS ONE LIFE CYCLE, AND THESE BECOME REUSABLE COMPONENTS TO MANAGE INDEPENDENTLY AND REUSE AND REIDENTIFY. COMMON METADATA: THERE'S INFORMATION THAT GOES ACROSS THE SURVEY, GOES ACROSS THE SURVEY PROGRAM, EVEN ACROSS STANDARDS. ESPECIALLY CLASSIFICATIONS AND CONCEPTS, SUCH AS THE COUNTRY CLASSIFICATION, STANDARD CLASSIFICATIONS. CONCEPT IS THE SAME THING. THIS GOES ACROSS CENTERS. SO THAT SHOULD BE MANAGED INDEPENDENTLY OF THE SURVEY. THIS IS A LIFE CHART; I'M SURE YOU HAVE SEEN THIS BEFORE MANY TIMES, WITH PRODUCTION, ARCHIVE, DISSEMINATION, THE USUAL WAY OF LOOKING AT THINGS. YOU CAN USE DDI FOR A VARIETY OF THINGS, ESPECIALLY NUMBER 3, ARCHIVING AND PRESERVATION, WHERE YOU CAN USE BOTH DDI LIFE CYCLE AND CODE BOOK TO DO THIS, AROUND THAT DATA, SO I CAN PRESERVE IT AND MAKE IT ACCESSIBLE WITH ENOUGH INFORMATION. AN INTERESTING THING TO TALK ABOUT IS LONG TERM PRESERVATION. IT WAS ONE OF THE QUESTIONS THIS MORNING. I REALLY LIKE TO TAKE DDI AS A LONG TERM PRESERVATION FORMAT. PROVISIONALLY WE EXPORT DATA TO ASCII, BUT THEN YOU END UP WITH THAT DATA AND, IF YOU LOOK, YOU HAVE A VARIABLE ON TOP AND NO INFORMATION. WITH DDI NEXT TO IT, YOU HAVE SUFFICIENT INFORMATION NOT ONLY TO REBUILD THE ENTIRE DATA DICTIONARY BUT TO KEEP KNOWLEDGE AROUND THE DATA. SO IN FIVE YEARS, 10 YEARS, 20 YEARS YOU CAN SEE (INAUDIBLE) XML AND YOU CAN ALSO THEN DO POPULATION AROUND IT TO REUSE THE DATA IN OTHER SOFTWARE. YOU USE XML TRANSFORMATIONS TO GENERATE SCRIPTS AND OTHER PROGRAMS THAT LOAD THE ASCII DATA INTO VARIOUS PACKAGES, WHATEVER SOFTWARE MAY EXIST IN 10, 20 YEARS, INTO A DATABASE, INTO EXCEL. THE ADVANTAGE IS THAT BECAUSE THE TRANSFORMATIONS ARE STANDARDS BASED THEY WORK FOR ALL THE DDIs. THEY BECOME REUSABLE AND WE HAVE TOOLS THAT DO THIS TODAY. THE BEAUTY OF THIS IS THAT YOU HAVE ASCII AND XML, TEXT FORMATS YOU CAN READ IN 20 YEARS, AND A POWERFUL COMBINATION TO PRESERVE DATA FOR A LONG TIME. IF YOU WANT TO TRANSFER DATA BETWEEN DIFFERENT PACKAGES THIS IS A NICE INTERMEDIARY. 
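THE IDEA OF KEEPING PLAIN ASCII DATA NEXT TO AN XML DESCRIPTION, AND LETTING THE DESCRIPTION DRIVE HOW THE DATA IS READ BACK, CAN BE SKETCHED AS BELOW. NOTE THAT THE XML LAYOUT HERE IS INVENTED FOR THE EXAMPLE AND IS MUCH SIMPLER THAN, AND NOT VALID AGAINST, ANY DDI SCHEMA. THE SAMPLE RECORDS AND THE FUNCTIONS `load_layout` AND `read_records` ARE LIKEWISE HYPOTHETICAL. ONLY THE PYTHON STANDARD LIBRARY IS USED.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metadata; real DDI documents look different.
METADATA_XML = """
<dataset>
  <variable name="id" start="1" width="3" label="Participant identifier"/>
  <variable name="sex" start="4" width="1" label="Sex">
    <category code="1" label="Male"/>
    <category code="2" label="Female"/>
  </variable>
  <variable name="age" start="5" width="2" label="Age in years"/>
</dataset>
"""

# Fixed-width ASCII records: meaningless without the metadata above.
ASCII_DATA = "001134\n002229\n"


def load_layout(xml_text):
    """Turn the XML description into a list of column definitions."""
    root = ET.fromstring(xml_text.strip())
    layout = []
    for var in root.findall("variable"):
        layout.append({
            "name": var.get("name"),
            "start": int(var.get("start")),  # 1-based column position
            "width": int(var.get("width")),
            "label": var.get("label"),
            "categories": {
                c.get("code"): c.get("label") for c in var.findall("category")
            },
        })
    return layout


def read_records(ascii_text, layout):
    """Slice each line by the layout and decode coded values to labels."""
    records = []
    for line in ascii_text.splitlines():
        row = {}
        for var in layout:
            begin = var["start"] - 1
            raw = line[begin:begin + var["width"]]
            row[var["name"]] = var["categories"].get(raw, raw)
        records.append(row)
    return records


layout = load_layout(METADATA_XML)
records = read_records(ASCII_DATA, layout)
```

THE SAME LAYOUT LIST COULD JUST AS WELL BE USED TO WRITE OUT AN IMPORT SCRIPT FOR A STATISTICAL PACKAGE INSTEAD OF READING THE DATA DIRECTLY; THAT STEP IS LEFT OUT HERE.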
I CAN GO FROM MY DATABASE TO THIS TO SOMETHING ELSE. YOU CAN GO BACK AND FORTH WITH THIS. YOU HAVE MORE INFORMATION THAN YOU HAVE IN ANY OF THE PACKAGES. SO IT'S AN IMPORTANT COMBO, INTERESTING IF YOU THINK ABOUT LONG TERM PRESERVATION OR PROCESSING ACROSS PACKAGES. (INAUDIBLE) THAT ALLOWS THIS. THE MOST IMPORTANT IS OBVIOUSLY THAT YOU KNOW WHAT THE DATA IS, YOU CAN GET ACCESS, DISCOVERY IS INFORMED (INAUDIBLE). WHEN YOU GO TO DATA PRODUCERS, SOMEHOW EVERYBODY IN THE WORLD MAKES THE ASSUMPTION THAT (INAUDIBLE). IT'S OFTEN NOT THE CASE. IF I'M IN EUROPE OR ASIA I MAY NOT KNOW THE SURVEY EXISTS. SO I NEED (INAUDIBLE) TO DISCOVER THE DATA, KNOWING THAT I WANT THIS PARTICULAR SURVEY FOR THIS PARTICULAR YEAR, BUT TYPICALLY AS RESEARCHERS WE LOOK AT DATA BY CONCEPT AND WHAT (INAUDIBLE). YOU UNDERSTAND CONCEPTS VERY WELL. YOU UNDERSTAND CLASSIFICATIONS VERY WELL. WHICH, AS I SAID, ARE VERY FUNDAMENTAL TO A DATA MANAGEMENT SYSTEM. THAT'S REALLY HOW, IN RESEARCH, WE LOOK FOR DATA. SO TO MAKE THE DATA AVAILABLE FOR A SEARCH ENGINE, (INAUDIBLE) BY CONCEPT. SO THAT'S QUITE IMPORTANT FOR THE DISCOVERY OF THE DATA. WE MENTIONED THAT. YOU CAN DO IMPORT, EXPORT, TRANSFORMATION. COMPARABILITY: SOME TAKE YOUR DATA AND DO THINGS YOU DIDN'T THINK ABOUT. DDI LIFE CYCLE HAS A COMPARATIVE MODULE ALLOWING A USER TO SAY I'M COMPARING THESE TWO AND THEY'RE COMPARABLE WITHIN THE CONTEXT OF MY RESEARCH; SOMEBODY ELSE COULD SAY THEY'RE NOT THE SAME BECAUSE OF A DIFFERENT CONTEXT. IT WORKS FOR VARIABLES, FOR FILES, FOR QUESTIONS, FOR DIFFERENT TYPES OF THINGS. IT ALSO HAS THE RIGHT METADATA FOR LINKAGES. (INAUDIBLE). IMPORTANT FOR YOU: YOU LOOKED A LOT AT THIS, THE DATA PRODUCTION, AND IT'S RELEVANT THAT YOU CAN USE DDI LIFE CYCLE FROM DAY ONE. YOU CAN CAPTURE COMMON METADATA; WE TALKED ABOUT THAT: CLASSIFICATIONS, REUSABLE THINGS ACROSS A SURVEY, A PROGRAM OR ACROSS AGENCIES. YOU CAN USE CLASSIFICATIONS. CLASSIFICATIONS OF DISEASES CAN BE DEFINED AND REUSED BY ANYBODY. 
WE KNOW THAT THAT IS HOW WE IMPROVE DATA QUALITY, TIMELINESS, PROCESSING, WORK FLOW MANAGEMENT. WHEN YOU DOCUMENT A SURVEY, YOU WANT TO DOCUMENT AT THE TIME YOU DO SOMETHING. IF I DON'T DOCUMENT MY CODE WHEN I WRITE IT, I WON'T IN TWO WEEKS, SO YOU WANT TO DOCUMENT AS YOU GO ALONG. IF YOU DOCUMENT LATER YOU FORGET OR YOU DON'T DO IT. GSBPM, THE GENERIC STATISTICAL BUSINESS PROCESS MODEL, WHICH WAS DEVELOPED BY STATISTICAL AGENCIES AROUND THE WORLD, IS A TEMPLATE FOR DATA PRODUCTION. THERE'S INFORMATION ON THE WEB IF YOU'RE INTERESTED, BUT YOU MAINLY WANT TO KNOW IF YOU'RE INTERESTED (INAUDIBLE), SO THERE ARE MAPPINGS BETWEEN THE TWO. LOOK AT DDI, AND THERE ARE PEOPLE INVOLVED IN (INAUDIBLE) VPN WHO COULDN'T MAKE IT TODAY. HE'S KNOWLEDGEABLE ABOUT THIS AND VERY IMPORTANT TO (INAUDIBLE). THERE ARE STANDARDS (INAUDIBLE) AND IN THIS WORKSHOP WE HAVEN'T MENTIONED THAT MUCH, BUT IF YOU LOOK AT AGGREGATED DATA, TIME SERIES AND HIGHER LEVEL DATA, THAT STANDARD COMPLEMENTS DDI QUITE WELL; THEY HAVE A LOT OF -- THEY ARE DESIGNED TO WORK TOGETHER. AND IF YOU LOOK AT AGGREGATED DATA AND PUBLISHED TABLES, YOU WANT TO BE ABLE TO TRACE WHICH MICRODATA WAS USED TO PRODUCE THESE TABLES, AND COMBINING (INAUDIBLE). TOOLS. (INDISCERNIBLE) YOU HAVE BEEN LOOKING AT THIS QUITE A BIT AND YOU KNOW YOU CAN GO ACROSS DIFFERENT PROGRAMS. SO I THINK THIS IS A LOT OF (INAUDIBLE) TOPICS. THE IMPORTANCE OF EARLY CAPTURE IS SOMETHING THAT AGAIN I MENTION BECAUSE A LOT OF TIMES I SEE DATA BEING HANDED TO DATA ARCHIVES AND DOCUMENTED AFTER THE FACT. IT IS IMPORTANT TO REALIZE THAT THE EARLIER YOU ARE IN THE LIFE CYCLE, THE MORE YOU KNOW ABOUT THE DATA. AS YOU GO DOWN INTO THE DATA ARCHIVE, THIS LABEL -- BUT THERE IS A LOT OF KNOWLEDGE WE'LL NEVER HAVE. YOU WANT TO DOCUMENT ALL INFORMATION AS SOON AS POSSIBLE, SO THAT WHEN YOU GET TO THE RESEARCHER YOU INCREASE THE KNOWLEDGE AROUND THE DATA. EARLY CAPTURE AND DOCUMENTATION OF DATA IS REALLY IMPORTANT. THE NIH IS QUITE SUPPORTIVE OF DDI THROUGH A FEW GRANTS, ACTUALLY THREE OF THEM. 
A REMINDER THAT THERE'S ALREADY SOME LEVEL OF CONNECTION BETWEEN NIH AND THE DDI TOOLS BEING PRODUCED, NOTABLY TO LEVERAGE METADATA OF STATISTICAL DATA AND SPECIFICS. THERE ARE DOMAIN SPECIFIC STANDARDS, NOT ONE THAT WILL DO IT ALL. DDI MIGHT WORK WELL FOR DATA PRODUCTION. IF YOU GO INTO AGGREGATED DATA IT'S PROBABLY NOT GOING TO DO -- YOU WANT TO DO THE STANDARD -- IF YOU'RE DOING SEMANTIC WEB TECHNOLOGY YOU NEED SOMETHING ELSE. IT'S IMPORTANT NOT TO LOOK AT ONE STANDARD BUT AT A SET OF STANDARDS THAT CAN COMPLEMENT EACH OTHER. IF YOU CAN DO THAT YOU CAN DO NICE COMPLETION AND DO INTERESTING THINGS. REDUCE THE BURDEN, AUTOMATE THE PROCESS, BETTER DATA QUALITY. I DO TALK ABOUT DDI (INAUDIBLE) THAT WE'VE BEEN INVOLVED WITH SO I ENCOURAGE YOU TO LOOK AT THE STANDARDS. IF YOU LOOK AT THE U.S. WITH A HIGHLY FEDERATED STATISTICAL SYSTEM, WE DO NEED CENTER. A LOT OF COUNTRIES I WORK WITH, IT'S (INAUDIBLE) AND WE HAVE ONE PLACE TO GO IN EUROPE. WE HAVE OVER 120 AGENCIES DOING STATISTICS. NOT THE MAJOR ONE. TO COMMUNICATE WE NEED STANDARDS, DATA IN DATA.GOV, WE NEED STANDARD XML TECHNOLOGY. SO (INAUDIBLE) METADATA, A FEW IDEAS ABOUT DDI AND WHAT YOU CAN DO WITH DDI. I WILL TURN NEXT TO (INAUDIBLE), WHO WILL TALK ABOUT WHAT THEY'RE DOING AT CORNELL WITH CENSUS DATA. AND THEIR (INAUDIBLE) PROJECT MOST RECENTLY. A FEW MORE BULLET POINTS ABOUT ONGOING PROJECTS AND INITIATIVES AROUND DDI THAT WE ARE AWARE OF OR INVOLVED WITH. THANK YOU. >> ANYONE HAVE ANY QUESTIONS HERE? >> WHEN PEOPLE EXCHANGE MONEY, WHEN YOU DO A TRANSACTION, YOU LOSE SOMETHING. WHEN SOMEONE WANTS TO TAKE A TEXT FROM ONE LANGUAGE TO ANOTHER AND BACK TRANSLATE IT AND FORWARD TRANSLATE AGAIN, IT DOESN'T TAKE MANY CYCLES BEFORE YOU'VE LOST WHAT YOU STARTED WITH. WHAT ASSURANCES DO WE HAVE, YOU MADE THE STATEMENT THAT YOU CAN GO BACK AND FORTH AND MOVE FROM ONE SYSTEM TO ANOTHER, THAT WE MAINTAIN THE INTEGRITY AND VALUE OF THE DATA? >> ASSURANCE, PROBABLY NONE. DEPENDS ON IMPLEMENTERS. 
I THINK IF YOU USE DDI AS A CENTRAL REFERENCE YOU PROBABLY HAVE MORE INFORMATION THAN IN THE PACKAGES THAT YOU HAVE. IT DEPENDS ON HOW YOU IMPLEMENT. I DON'T THINK IT'S -- IF YOU TAKE DATA, PUT IT INTO STATISTICAL PACKAGES AND COME OUT OF THERE, YOU LOSE METADATA BECAUSE THEY DON'T HAVE AS MUCH INFORMATION. BUT YOU CAN IMPLEMENT IN A WAY WHERE YOU MINIMIZE THAT LOSS AND YOU CAN KEEP THE INFORMATION THAT YOU NEED TO CONTINUE TO PROCESS. BUT THE ASSURANCE COMES FROM THE IMPLEMENTATION, NOT FROM THE STANDARD. >> I WOULD LIKE TO THANK DR. HIRSCHFELD AND JAY GREENFELD FOR THE INFORMATION TODAY. PASCAL LEFT OFF WITH A SUMMARY, I'M CO-PI ON A $3 MILLION NSF PROJECT THAT TOUCHES THESE POINTS. I UNDERSTAND I NEED TO KEEP MY TIME SHORT SO YOU CAN HAVE TIME TO WRAP UP. THE PROJECT IN QUESTION IS THE NSF NCRN PROJECT; WE HAVE ONE OF 8 NODES. THE OFFICIAL TITLE IS AT THE TOP, WE'RE ABOUT INTEGRATING RESEARCH SUPPORT, TRAINING AND DOCUMENTATION. MY PIECE IS THE DATA DOCUMENTATION PART OF THE PROJECT. THAT'S THE BULK OF THE MONEY. I'M NOT GOING TO TALK ABOUT THE FIRST TWO PARTS OF THE PROJECT. 8 NODES ARE FUNDED WHICH RANGE BETWEEN $1.2 MILLION AND $3 MILLION EACH. THE INVESTIGATORS THERE FROM CORNELL ARE LISTED. SPECIFICALLY WE'RE BUILDING WHAT WE'RE CALLING, THIS IS THE WORLD'S WORST ACRONYM, NOT QUEUE QUEUE COUPLER, THE -- CUCUMBER. WE ARE ABOUT IN THIS PROJECT HELPING SOLVE THE PROBLEM OF SOCIOECONOMIC DATA AND OFFICIAL STATISTICS THAT HAVE A NEED FOR CONFIDENTIALITY RESTRICTIONS AND PRIVACY AND HOW IT CONFLICTS WITH THE NOTION THAT WE'RE CONDUCTING SCIENTIFIC RESEARCH. HEALTH DATA HAS A VERY SIMILAR PROBLEM, I THINK MY INITIAL QUESTION THIS MORNING WAS DOES THE NCS PLAN ON RELEASING PUBLIC USE FILES? THE ANSWER WAS YES, THAT DENOTES THERE WILL BE PRIVATELY RESTRICTED VERSIONS OF DATA AVAILABLE AS WELL. THAT IS A REAL PROBLEM, WE SUBMITTED A PROPOSAL TO ADDRESS THIS PROBLEM AND I'LL TALK ABOUT THAT. THE BACKGROUND OF THIS IS, PUBLIC USE FILES HAVE BEEN IN THE SOCIAL SCIENCES FOR A LONG TIME. 
AND A TON OF RESEARCH COMES FROM THAT, MY BACKGROUND, I SPENT 20 YEARS AT MINNESOTA WORKING ON THE PROJECTS WITH HARMONIZED CENSUS DATA. MY LIFE WAS BUILT ON PUBLIC USE FILES FOR MANY YEARS. THEY'RE TERRIFIC. BUT IT TURNS OUT THAT THE RESEARCH QUESTIONS ONE ADDRESSES ARE LIMITED OR TEND TO BE MUCH MORE LIMITED WITH PUBLIC USE FILES BECAUSE YOU DON'T HAVE THE SPECIFICITY THAT YOU HAVE IN RESTRICTED CONFIDENTIAL DATA FILES. RESEARCHERS NEED ACCESS TO THE FILES. THEY DEMAND IT. THE QUESTIONS THEY CAN ADDRESS, AND THE QUESTIONS THEY ASK, THE DIFFERENT DATA SETS THEY PUT TOGETHER ALLOW YOU TO DO MUCH MORE INTERESTING AND IN DEPTH COMPLICATED WORK. THE FUTURE IS QUITE CLEAR. THE PROBLEM IS THEY DON'T TEND TO LEAVE BEHIND A SCIENTIFIC TRAIL THAT IS REPRODUCIBLE. WHAT HAPPENS, A RESEARCHER WILL SUBMIT A PROPOSAL FOR ACCESS AT CORNELL, (INAUDIBLE) RUNS A SIMILAR SHOP AT NORC, AND RESEARCHERS GET ACCESS TO THE DATA, THEY GO INTO THE ENCLAVE, THEY DO THE RESEARCH, THEY TAKE OUT THE MINIMAL INFORMATION ALLOWED TO PUBLISH THE RESULT IN A JOURNAL, AND IT'S HARD FOR ANOTHER RESEARCHER TO SAY I WANT TO GET THAT DATA TO TEST A HYPOTHESIS AND REPLICATE THE RESULTS. THEY CAN'T DO IT. THAT'S A HUGE PROBLEM IF WE'RE CALLING OURSELVES SCIENTISTS. THAT WAS THE UNDERLYING POINT WE MADE IN OUR PROPOSAL TO NSF. SO OUR CCBMR, WE ARE DEALING WITH A PROBLEM, FACILITATING ACCESS TO DETAILED METADATA ON RESTRICTED ACCESS FILES OUTSIDE THE RDC, AS WELL AS PUBLIC USE DATA SETS. IT'S IRONIC, THE CENSUS BUREAU HAS A TERRIFIC DATA CENTER NETWORK PROVIDING ACCESS TO THESE RESTRICTED FILES BUT IT'S HARD IF YOU'RE NOT INSIDE TO KNOW WHAT IS AVAILABLE. SAY I WANT TO ANSWER THIS QUESTION: DO YOU HAVE DATA AT THIS LEVEL OF SPECIFICITY, AGE BREAKDOWN BY YEAR OR FIVE YEAR AGE GROUPS? IT'S HARD TO TELL WHAT'S THERE. SO HIT AND MISS, VERY INEFFICIENT. THEY CAN'T TELL WHAT'S AVAILABLE. 
IN MY DAILY WORK I ROUTINELY SEE RESEARCHERS AT THE CORNELL RDC LEAVING THE RESTRICTED ENVIRONMENT, WALKING DOWN THE HALL, SITTING AT THIS REALLY POOR WORKSTATION OUTSIDE MY DOOR, A PUBLIC WORKSTATION IN THE MIDDLE OF AN OFFICE, AND LOOKING UP PUBLIC DOCUMENTATION THAT I WAS STUNNED WHEN I REALIZED IS NOT AVAILABLE IN THE RDC. YOU THINK THAT BOY, THEY HAVE ACCESS TO EVERYTHING IN THERE. FROM THE CENSUS BUREAU. YOU CAN'T GET TO (INAUDIBLE) FROM INSIDE A CENSUS RDC. SO IT'S SORT OF THE GOLD STANDARD HARMONIZED CENSUS DATA, U.S. AND INTERNATIONAL DATA, YOU CAN'T GET ACCESS TO THAT DOCUMENTATION FROM WITHIN. BUILD A FACILITY FOR 4,000, BUT MORE IMPORTANTLY A TOOL KIT TO ALLOW OTHERS TO DO THE SAME THING WITH OTHER METADATA, AND THEN WE KEEP TRACK OF WHAT METADATA IS PUBLIC, WHAT IS PRIVATE. IF SOMETHING IS ON THE OUTSIDE WE'LL BRING IT IN AND KEEP TRACK OF THE FACT THAT IT WAS PUBLIC, SO IT CAN GO BACK OUT AGAIN, AND IF SOMETHING IS CONFIDENTIAL AND RELEASED WE'LL TRACK THAT TOO AND MAKE IT AVAILABLE TO THE PUBLIC. BUT WE'LL HAVE A HANDLE ON WHAT'S PRIVATE AND NEVER RELEASE THAT TO THE PUBLIC. IN ADDITION, IN THIS ENVIRONMENT THERE'S METADATA THAT CAN'T BE MADE PUBLIC. YOU CAN'T ACKNOWLEDGE THAT WE ASKED THIS QUESTION AND HANDLE IT IN THIS WAY, THAT WILL GIVE AWAY INFORMATION THE BUREAU IS NOT COMFORTABLE WITH. IN THIS PROJECT WE'RE GOING TO TRY TO EXPAND THE NOTION OF WHAT METADATA MEANS TO INCLUDE USER GENERATED COMPONENTS, NOTES, PROGRAMS, ET CETERA. THIS TAKES US BEYOND, IN THE SOCIAL SCIENCES, THE SORT OF STANDARD SOURCES OF WHAT METADATA MAY BE. AS MUCH AS POSSIBLE, DDI, SDMX. WE'RE GOING TO MODIFY THINGS, WE'RE CONNECTED TO THE DDI EFFORT. A QUICK EXAMPLE, WE MADE THIS UP IN THE PROPOSAL: IF WE'RE JUST GOING TO SIMPLY TAG A VARIABLE, YES, YOU CAN DISCLOSE THE FACT THAT THIS VARIABLE IS DISCLOSABLE, THE ZERO RESPONSE IS DISCLOSABLE, BUT YOU CAN'T ADMIT BILL GATES HAS 345,878 IN SEATTLE BECAUSE YOU KNOW WHO THAT WAS. 
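THE TAGGING IDEA JUST DESCRIBED CAN BE SKETCHED IN A FEW LINES OF PYTHON. THIS IS ONLY AN ILLUSTRATION UNDER STATED ASSUMPTIONS, NOT THE PROJECT'S ACTUAL DESIGN: THE `VariableMetadata` CLASS, THE `public_view` AND `restricted_view` FUNCTIONS, AND THE SAMPLE VARIABLES ARE ALL HYPOTHETICAL. EACH VARIABLE CARRIES A DISCLOSABLE FLAG, AND EACH NOTE ABOUT A SPECIFIC VALUE CARRIES ITS OWN FLAG, SO THE ZERO RESPONSE CAN BE PUBLIC WHILE AN IDENTIFYING VALUE STAYS ON THE RESTRICTED SIDE.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VariableMetadata:
    """Hypothetical metadata record with variable- and value-level flags."""
    name: str
    label: str
    disclosable: bool  # may the existence of this variable be public?
    # value -> (note, may this note be shown publicly?)
    value_notes: Dict[str, Tuple[str, bool]] = field(default_factory=dict)


def public_view(variables: List[VariableMetadata]) -> List[dict]:
    """Keep only disclosable variables and, within them, disclosable notes."""
    view = []
    for var in variables:
        if not var.disclosable:
            continue  # private variables never cross to the public side
        notes = {value: note
                 for value, (note, ok) in var.value_notes.items() if ok}
        view.append({"name": var.name, "label": var.label,
                     "value_notes": notes})
    return view


def restricted_view(variables: List[VariableMetadata]) -> List[dict]:
    """Inside the restricted zone, public and private metadata are both visible."""
    return [{"name": var.name, "label": var.label,
             "value_notes": {value: note
                             for value, (note, _ok) in var.value_notes.items()}}
            for var in variables]


# Made-up records echoing the example in the talk.
VARIABLES = [
    VariableMetadata(
        "ASSETS", "Reported assets", True,
        {"0": ("zero response", True),
         "345878": ("single identifiable respondent", False)},
    ),
    VariableMetadata("INTERNAL_FLAG", "Internal edit flag", False),
]

if __name__ == "__main__":
    print(public_view(VARIABLES))
    print(restricted_view(VARIABLES))
```

THE PUBLIC SIDE IS DERIVED BY FILTERING THE RESTRICTED RECORDS, SO NOTHING BECOMES PUBLIC UNLESS IT IS EXPLICITLY FLAGGED.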
THAT'S A SIMPLE EXAMPLE TO SHOW YOU, I'M SURE YOU UNDERSTAND THE POINT. HERE IS JEREMY WILLIAMS IN THE BACK. ENTERPRISE APPLICATION. THE LINE DOWN THE MIDDLE IS THE DISTINCTION BETWEEN THE PUBLIC SIDE OF THE DATA SET AND THE PRIVATE SIDE OF THE DATA SET AND JEREMY, DO YOU WANT TO TURN YOUR MICROPHONE ON AND GIVE US A MINUTE ON THIS? >> SURE. SO THE BASIC IDEA IS TWO ZONES, RESTRICTED AND PUBLIC. THE IDEA IS THERE'S A SYNCHRONICITY BETWEEN THE TWO. PRIVATE CAN'T COME FROM THE PUBLIC SIDE, AND FROM THE PUBLIC SIDE IT WILL ENRICH THE RESTRICTED ACCESS METADATA SO THEY DON'T HAVE TO GO TO THE UGLY WORKSTATION OUTSIDE BILL'S OFFICE. WE'RE TAKING EXISTING METADATA FROM DISPARATE SOURCES, BRINGING IT THROUGH -- I CAN'T QUITE READ FROM HERE BUT THAT'S THE NORMALIZATION TIER. BUT THE IDEA, THERE ARE A NUMBER OF LAYERS THAT BRING THE DATA FROM DISPARATE SOURCES TO A CORE. SO WE WILL BE ESTABLISHING A METADATA STRUCTURE BASED ON DDI TO FACILITATE THE SYNCHRONIZATION BETWEEN THOSE TWO. SYNCHRONICITY. IT PROBABLY WON'T LOOK LIKE IT BUT IT SHOULD LOOK FAMILIAR WHEN WE GET THERE BECAUSE THERE ARE USERS OF IPUMS DATA AND THERE'S NOT MUCH DIFFERENCE. SO PUBLIC ONLY HAS ACCESS TO WHAT'S PUBLIC AND INTERNAL WILL HAVE ACCESS TO PUBLIC AND PRIVATE INFORMATION. IN THE PROPOSAL WE HIT ALL THOSE BULLET POINTS THAT PASCAL FINISHED WITH. THIS IS RELEVANT TO THE NATIONAL CHILDREN'S STUDY ON CONFIDENTIAL DATABASE AS WELL AS LONGITUDINAL DATA. HERE IS A LIST FROM JEREMY IVERSON, THE COLLEAGUE WHO COULDN'T ATTEND. OTHER PROJECTS ARE LEVERAGING DDI. PASCAL IS INVOLVED IN THE IHSN PROJECT. THE CANADIAN CENSUS RDC ENVIRONMENT ALREADY USES DDI UNDERNEATH. THE GERMAN BUREAU OF STATISTICS, VERY COOL PROJECT IN EUROPE. TRYING TO BRIDGE BOUNDARIES AND ALLOWING ACCESS TO CERTAIN KINDS OF DATA IN EUROPE. AND POTENTIALLY U.S. ACCESS TO SENSITIVE DATA. THAT USES DDI AS WELL. HERE ARE A COUPLE OF LONGITUDINAL SURVEYS IN DDI. I'M NOT FAMILIAR WITH MIDAS. PASCAL, DO YOU WANT TO COMMENT? 
HE WAS AT ONE WORKSHOP WE WERE AT. THE WISCONSIN LONGITUDINAL STUDY AS WELL. THOSE ARE BOTH VERY, VERY -- LONG-TERM STUDIES, LONGITUDINAL DATA, AND THEY HAVE BEEN PARTICIPATING THE LAST COUPLE OF YEARS IN EUROPE IN DDI AND LONGITUDINAL DATA AND THEY'RE -- THE BROAD UK COHORT STUDY, APPLICABLE TO THE US. THE PANEL SURVEY DATA IS ALSO LONGITUDINAL AND THEY'RE USING DDI. I WILL CLOSE TODAY WITH A PICTURE OF A POSTER THAT I WAS ONE OF THE PRESENTERS OF. ON THE LEFT YOU HAVE THE LIFE CYCLE OF SOCIAL SCIENCE RESEARCH DATA. I GET TIRED OF THE LIFE CYCLE GRAPHIC YOU HAVE SEEN 20 TIMES TODAY. I LIKE THIS ONE. RESEARCH DATA MANAGEMENT IS A BIG PART OF WHAT I DO AT CORNELL. NIH ARE INCREASINGLY BEHIND IT. I PUT THE GRAPH TOGETHER. YOU START, THE RESEARCHER HAS AN IDEA, DO PROCESSING AND PUBLICATION. THIS IS ALL RESEARCH DATA MANAGEMENT, I TRIED TO PUT A FOLDER AROUND IT TO MAKE IT COOL. THE MAIN THING IS TO BRIDGE BETWEEN WORK THAT YOU'RE DOING WITH DATA AND HOW THE METADATA ALLOWS SUBSEQUENT SEARCH AND DISCOVERY. I'M PARTIAL TO THIS GRAPHIC, BETTER LOOKING THAN THE OTHER ONE, BUT WE PRESENTED THIS IN THE CONTEXT OF DATA OUT THERE: YOU DON'T NEED TO WORRY IN SOME CASES ABOUT PUTTING DISCOVERY TOOLS IN PLACE BECAUSE THERE ARE SOURCES LIKE ALPHA AND GOOGLE AND EVERYBODY ELSE THAT WILL BE GLAD TO SUCK UP PUBLIC METADATA AND PROVIDE ACCESS TO SEARCH AND DISCOVERY. SO THAT IS JUST THAT POSTER. I THINK THIS MAY -- THAT WAS THE LAST SLIDE. AND TIM, DO YOU WANT TO SAY ANYTHING ABOUT THE ENCLAVE? >> NO QUESTIONS. >> WE ARE RUNNING LATE ON TIME. >> ARE THERE ANY QUESTIONS? >> ALWAYS NICE TO GO LATE BECAUSE THEN THERE ARE NO QUESTIONS IF YOU'RE THE LAST SPEAKER. >> WE'RE GOING TO TARGET ENDING NO LATER THAN 5 O'CLOCK AND IF I CAN I WILL STRIVE TO FINISH AHEAD OF TIME, AHEAD OF SCHEDULE AND UNDER BUDGET. THOSE ARE THE WAYS WE TRY TO WORK. COULD WE HAVE THE SCREEN ON THE LEFT RAISED UP? PERHAPS WE CAN, JUST A QUESTION OF TIME. MAYBE WE HAVE TO DO BOTH. DO BOTH. 
ANYWAY, A FEW OF US WERE TALKING IN THE BREAK AND CAME TO SOME RATHER QUICK CONSENSUS ON HOW WE WERE GOING TO SUMMARIZE WHAT WE HEARD. CAN WE HAVE THE ROOM LIGHT, PLEASE. WE HEARD THE SAME THING FROM EVERYONE, TO PLACE WHAT WE'RE DOING IN CONTEXT. I COULD SAY THESE WERE WRITTEN AT 8 THIS MORNING BUT THAT WOULD BE OVERSTATING IT. NOT SURE THE BUILDING WAS OPEN AT 8 THIS MORNING. IN ANY CASE, WHATEVER IS DONE HAS TO BE DONE IN THE CONTEXT. YOU NEED TO UNDERSTAND WHAT IS BEING ASKED AND THE RESOURCES AVAILABLE AND HOW YOU CAN DEFINE THE TERMS AND CONCEPTS. WE MUST HAVE TRANSPARENCY. WHATEVER WE DO IN THE NATIONAL CHILDREN'S STUDY, WE'RE COMMITTED TO TRANSPARENCY OF OUR PROCESS SO SOMEONE CAN KNOW WHAT WE'RE DOING AND WHY WE'RE DOING IT. IT'S THE ONLY WAY TO REACH LONG TERM GOALS. IN ADDITION, SPECIFICALLY FOR METADATA AND INFORMATICS OPERATIONS, WE NEED TO BE SCRUPULOUS ABOUT VERSIONING BECAUSE THERE ARE RESOURCES INVESTED. WHAT WE WERE HEARING FROM SEVERAL FOLKS TODAY IS HUMAN RESOURCES. PEOPLE TAKING TIME. AS WE OFTEN SAY IN THE NCS PROGRAM OFFICE, TIME IS NOT OUR FRIEND. SO THE WAY WE CAN ADDRESS WHAT WE CAN DO AND SAVE THE TIME AND RESOURCES IS IF WE FOCUS ON THE COMPUTABILITY OF OUR RESOURCES. WE HEARD HOW WE CAN AUTOMATE AND MAKE MACHINE READABLE MANY TYPES OF DATA. SO WE NEED TO FOCUS ON EACH OF THESE THEMES SO THAT WE GET TO A POSITION TO LEVERAGE WHAT WE ARE DOING WITH WHAT OTHER PEOPLE ARE DOING. IF WE ARE ABLE TO LEVERAGE, THEN LEVERAGING MEANS THAT YOU HAVE PARTNERS. PARTNERS ARE WHAT WE WERE TRYING TO BRING TOGETHER TODAY. SOME OF YOU MAY NOT BE PARTNERS WITH THE NATIONAL CHILDREN'S STUDY BUT WE ENCOURAGE THAT. AND SOME OF YOU ARE PARTNERS. I THINK I FEEL COMFORTABLE SAYING CATEGORICALLY THAT ALL OF YOU THAT ARE PARTNERS ARE WILLING PARTNERS AND FIND IT A TRUE PARTNERSHIP. THAT WE INTERACT. 
THAT IT'S NOT JUST THE RELATIONSHIP OF US, THE FEDERAL GOVERNMENT, HIRING PEOPLE AS CONTRACTORS OR CONTRACTORS WORKING INDEPENDENTLY, BUT WE WORK TOGETHER AND WE HAVE SHARED SUCCESS AND SHARED FRUSTRATIONS. IF WE BUILD THE PARTNERSHIPS, THE QUESTION IS WITH WHOM. THIS IS WHERE I THINK I WOULD LIKE TO CLOSE AND SAY WE DON'T KNOW EXACTLY WHO OUR PARTNERS ARE, JUST LIKE WE DON'T KNOW YET THE DETAILS OF THE STUDY NOR DO WE KNOW WHO IS ENROLLED IN THE STUDY BECAUSE MOST OF THEM HAVEN'T BEEN CONCEIVED YET. SO WE ARE GOING TO BE OPEN AND RECEPTIVE TO PARTNERSHIPS. I HOPE THAT EVERYONE IN THIS ROOM AND EVERYONE LISTENING ON THE BROADCAST TODAY OR WHO WILL SEE THE VIDEOCAST IN THE FUTURE REALIZES THAT WE ARE LOOKING FOR PARTNERSHIPS TO ACHIEVE THESE GOALS AND THERE IS A CYCLE TO ALL OF THIS. SO IT REINFORCES. WHEN WE REINFORCE THIS CYCLE WE'LL TURN THE NATIONAL CHILDREN'S STUDY INTO SOMETHING WHICH WILL HELP EVERYONE. WE SAY THAT WE ARE EXPECTING OF PARTICIPANTS THAT THEY WILL DONATE THEIR TIME AND DONATE THEIR DATA IN ORDER TO HELP THE HEALTH OF FUTURE GENERATIONS AS WELL AS THEIR OWN. WE HOPE WITH OUR MEDICAL INFORMATICS ENDEAVORS THAT WE HELP NOT ONLY THE NATIONAL CHILDREN'S STUDY BUT OTHER STUDIES AND THOSE FUTURE STUDIES TOO. SO I WOULD LIKE TO ASK IF ANYONE IN THE ROOM WOULD LIKE TO MAKE ANY ADDITIONAL OBSERVATIONS OR COMMENTS BEFORE WE WRAP UP. SO I'LL START OVER HERE WITH JAY GREENFIELD. >> I FOUND THIS MEETING WAS VALUABLE. IT BROUGHT FRESH PERSPECTIVES TO THE WORK WE WERE DOING. THERE ARE SPECIFIC AREAS AND CONTRIBUTIONS THAT FOLKS HERE CAN MAKE. FOR EXAMPLE, WITH THE OPERATIONAL DATA ELEMENTS THERE'S ALREADY SOME PRELIMINARY WORK LOOKING AT HOW SOME OF THOSE DATA ELEMENTS CAN BE HARMONIZED WITH HL7 AND THE BRIDG MODEL, AND THERE ARE A LOT OF COMMON DATA ELEMENTS BETWEEN THE TWO. IF THAT SEEMS TO BE A VALUABLE WAY TO PROCEED, WE SHOULD DO THAT. WE HAVE ALSO BEEN ENRICHED PREVIOUSLY BY OTHER STANDARDS, SPECIFICALLY CDISC ODM. 
THE RELATIONSHIP BETWEEN THE CDISC ODM STANDARD AND WHAT THE DDI ARE DOING IS VERY CLOSE. I REMEMBER NOW THAT WITH THE INSTRUMENT ELEMENTS, ONE THING WE DID EARLY ON WAS WE INCLUDED MANY OF THE ELEMENTS OF THE CDISC ODM, SO WE CAN CONTEXTUALIZE THE DATA ELEMENTS IN EVENTS AND STUDIES AND SO FORTH AND SO ON. SO WE HAVE ALL IN THE ROOM DIFFERENT STANDARDS AND THOUGHT ABOUT HOW TO HARMONIZE THOSE STANDARDS. I DON'T KNOW HOW WE GO OR WHERE WE GO FROM HERE IN TERMS OF BUILDING A SYSTEM BUT I THINK THE INPUTS HAVE BEEN EXTREMELY VALUABLE. >> I FOUND IT VERY VALUABLE. I LEARNED A GREAT DEAL ABOUT THE PROCESS AND THE DIFFERENT FRAMEWORK FACILITY. I WAS VERY THANKFUL TO BE HERE. ARE PEOPLE GOING TO GET PDFs OF SLIDES? IS THAT AVAILABLE? >> ABSOLUTELY. WE WILL HAVE ALL THE SLIDES AND WE'LL POST THEM ON THE NCS WEBSITE. I THINK WE HAVE -- I WAS LOOKING FOR DARCY SMITH OR SOMEONE FROM CYCLE SOLUTIONS, BUT WE HAVE ALL THE SUBMITTED SLIDES TO OUR LOGISTIC SUPPORT CONTRACTOR AND WE'LL MAKE SURE THAT EVERYONE THAT REGISTERED FOR THE MEETING WILL GET A PACKET. AND WE'LL POST THEM TOO. >> I TOO FOUND IT VERY VALUABLE AND WE HAVE A PRETTY ECLECTIC GROUP OF ORGANIZATIONS REPRESENTED HERE. I'M WORKING CLOSELY WITH TERMINOLOGY HARMONIZATION AND METADATA EFFORTS. ONE KEY THING REALLY IS LEVERAGE, THERE'S CLEARLY A LOT UP HERE TO USE. DIFFERENT CONSTRUCTS TO REUSE STANDARDS, SOME OF THE SAME CHALLENGES AND OPPORTUNITIES TO LEVERAGE ACROSS DIFFERENT ORGANIZATIONS, SO I'M ENCOURAGED BY THE DISCUSSION TODAY AND CONTINUING DISCUSSIONS AFTER THIS WORKSHOP. >> I THINK I HAVE ENJOYED SEEING ALL THE ADDITIONAL PIECES THAT WE CAN ADD TO OUR TOOL CHAIN INCLUDING UPSTREAM TOWARDS OUR INSTRUMENT DEVELOPMENT AND PROTOCOL DEVELOPMENT AND DOWNSTREAM LOOKING AT HOW WE CAN MAKE DATA SETS AND DATA DOCUMENTATION ACCESSIBLE BOTH IN THE CONTEXT OF A DATA ENCLAVE AND PUBLICLY AVAILABLE. SO ADDING THOSE PIECES TO OUR TOOL SET WILL HELP US MOVE FORWARD. >> SEE IF MY THROAT HOLDS OUT. 
I WAS ACTUALLY VERY GLAD TO HEAR DISCUSSION ABOUT HOW PEOPLE ARE USING OUR FIRST PASS AT A METADATA STANDARD FOR THE NATIONAL CHILDREN'S STUDY AND HOW IT IS A VERY GOOD START BUT WE HAVE GROUNDWORK SET FOR BETTER EVOLUTIONARY CHANGES GOING FORWARD. SO LOOKING FORWARD TO SOME OF THAT. >> AS WE ALL KNOW, IF WE HAVE BEEN INVOLVED WITH THE NCS WE'RE IN A CONSTANT STATE OF CHANGE AND IMPROVEMENT AND UPGRADING. ONE THING THAT HEARTENED ME ABOUT THIS PARTICULAR MEETING WAS THAT IT GAVE US A COMBINATION OF DIRECTIONS IN WHICH WE CAN MOVE EFFECTIVELY TO MAKE REAL IMPROVEMENTS IN THE SHORT TERM WITH THE CONTEXT OF MAKING LONGER TERM GAINS, AND PARTICULARLY THE LEVERAGING AND PARTNERSHIP ASPECTS THAT MADE THEMSELVES EVIDENT IN WHAT WENT ON. >> I'M RUTH BRENNER FROM THE PROGRAM OFFICE, I ALSO FOUND THIS TO BE A REALLY INTERESTING AND VALUABLE MEETING. A LOT TO DIGEST IN ONE DAY SO LOOKING FORWARD TO SEEING SLIDE SETS AND HAVING TIME TO THINK HOW WE CAN BEST INCORPORATE THE INFORMATION WE LEARNED TODAY INTO THE NCS. >> I'M FRANK WHITE AND I'M WITH BOOZ ALLEN AND NEW TO THIS ENTIRE PROGRAM, SO EVERYTHING TODAY WAS EYE OPENING AND DRINKING FROM A FIRE HOSE, BUT A LOT OF EXCITING INFORMATION AND I LOOK FORWARD TO LEARNING A LOT MORE OVER THE NEXT FEW MONTHS. >> KRISHNA PARK. I CO-LEAD THE DATA ANALYSIS TEAM FOR THE PROGRAM OFFICE WITH BRIAN HAUGEN. THIS WAS A VERY -- I THINK SOMEONE HAS MENTIONED THAT NCS IS TAKING A VISIONARY ROLE IN THIS EFFORT, STARTING FROM THE DATA COLLECTION AND MOVING IT UP TO METADATA. I CAN SEE SINCE THIS IS A LONG TERM STUDY A LOT OF BENEFIT WILL COME OUT OF THIS MODEL. I WANT TO SAY THAT AS AN ANALYST, BECAUSE OF THE RESOURCES PUT INTO PLACE, THINGS LIKE THIS, METADATA REPOSITORY, OR THESE TYPES OF EFFORTS, FUTURISTIC EFFORTS, RATHER THAN, FOR EXAMPLE, HAVING A BETTER QUALITY ASSURANCE AND QA/QC SYSTEM RIGHT NOW, IT MAKES MY JOB DIFFICULT BECAUSE I HAVE TO DEAL WITH ALL THESE DIFFERENT DATA PIECES THAT ARE COMING FROM DIFFERENT SOURCES. 
BUT I BELIEVE THIS WHOLE PROCESS WILL BENEFIT THE FUTURE OF THE NATIONAL CHILDREN'S STUDY AND MANY OTHER ENDEAVORS AS WELL. I'M HAPPY TO BE PART OF THE NCS. >> JEREMY WILLIAMS, I WORK WITH BILL AT THE CORNELL INSTITUTE AS A SOFTWARE DEVELOPER. MEETINGS LIKE THIS ARE WHAT WE NEED AS DEVELOPERS, I THINK, AND IT SEEMS THE MORE METADATA ORIENTED MEETINGS I GO TO, THERE'S A LARGER CONTINGENT OF SOFTWARE DEVELOPERS. THAT'S REALLY INTERESTING. MY HOPE IS TO BUILD TOOLS THAT THERE IS DEMAND FOR, SO MEETINGS LIKE THIS PROVIDE A GREAT CONTEXT. SO THERE ARE A LOT OF TERMS TO LOOK UP LATER THAT I LOOK FORWARD TO LEARNING MORE ABOUT, BUT THIS HAS A GOOD CONTEXT, THERE'S A LOT OF GREAT CONTEXT. >> AS DIRECTOR OF THE SOCIOECONOMIC RESEARCH CENTER AT CORNELL I'M INCREASINGLY SEEING RESEARCHERS FROM THE HEALTH SCIENCES TRYING TO COLLABORATE BACK AND FORTH AMONG DISCIPLINES, SO THIS WAS A TERRIFIC EXPERIENCE FOR ME TODAY. I WOULD LIKE TO CHALLENGE THE FOLKS AT NCS. I DON'T SEE THERE'S ANY DOUBT YOU NEED TO THINK LONGER THAN TWO DECADES AS FAR AS LONGITUDINAL DATA. I CAN'T BELIEVE THE CASE WON'T BE MADE IN 20 YEARS, AS THESE INFANTS BECOME 21-YEAR-OLDS, THAT THERE'S A GREAT DEMAND TO CONTINUE TO FOLLOW THEM, AND THERE'S CLEARLY A NATIONAL EARLY ADULT SURVEY GOING ON AT SOME POINT IN THE FUTURE. SO THIS IS REALLY LONGITUDINAL DATA ACROSS A LIFE SPAN, I HOPE. >> VERY HELPFUL TO LEARN MORE ABOUT THE NCS, I DIDN'T KNOW TOO MUCH ABOUT THE STUDY TWO, THREE WEEKS AGO AND I LEARNED A LOT. AS A COMPANY, FOUNDATION, INDIVIDUAL DDI EXPERT, I'M THANKFUL FOR THE CHANCE TO PRESENT TODAY AND HOPEFULLY TO WORK TOGETHER IN THE FUTURE. IF WE CAN HELP YOU WITH DDI WE WILL WELCOME THAT OPPORTUNITY AND I THINK WE HAVE A LOT TO LEARN. HEALTH DATA IS NEW TO THE DDI COMMUNITY AND THERE ARE THINGS WE CAN LEARN FROM EACH OTHER. SO COLLABORATION WILL BE GREAT ON THIS EXTREMELY IMPORTANT STUDY. SO THANK YOU VERY MUCH. 
>> I WORK WITH PASCAL. INTERESTING TO SEE HOW TECHNOLOGY, SOCIAL SCIENCE AND HEALTH ARE ALL TOGETHER. ALSO, WHEN IS THIS STARTING, BECAUSE I WANT MY CHILDREN TO BE PART OF THIS. LET ME KNOW. I'LL PLAN ACCORDINGLY. >> (INAUDIBLE) BOOZ ALLEN HAMILTON. I WORK BEHIND THE SCENES ON THE NCS MDR, AND VERY INTERESTING TO SEE THIS FROM ANOTHER PERSPECTIVE. GLAD TO HEAR FROM (INAUDIBLE) AND TO KNOW WHAT THEY ACTUALLY NEED (INAUDIBLE) AND PROMISE THEY WANT TO SEE. THANK YOU. >> (INAUDIBLE) I WORK ON THE MDR TEAM. AS WE WORK ON THE MDR, ONE OF THE MANY QUESTIONS WE TRY TO ANSWER IS, WHAT CAN WE DO WITHIN THE -- TO MAKE IT USEFUL TO THE COMMUNITY, TO NCS? I THINK COMING TO THIS WORKSHOP HAS BROADENED THE -- THE MDR. FOR INSTANCE, THE TOPIC THAT WAYNE GAVE IN HIS TALK ON GLOBAL (INAUDIBLE) REPOSITORY HAS BEEN VERY THOUGHT PROVOKING. RIGHT NOW THINKING OF THE MDR WE THINK IN TERMS OF (INAUDIBLE) STANDARDS. BUT THE FACT SO MANY SENORS ARE CONSIDERED AND HOW WE CREATE SOMETHING THAT WILL BE GENERALLY USEFUL HAS BEEN REALLY USEFUL AND HE'S -- THEIR EXPERIENCE WITH THE MDS AND BECAUSE THAT TIES INTO WHAT WE CAN DO, ARTIFACTS LIKE COMPUTATIONAL ABILITY, AND IN THAT CASE THEY CAN ACTUALLY HELP, SO OVERALL THIS WORKSHOP HAS BEEN USEFUL IN HELPING ME THINK ABOUT HOW WE WANT TO GO FORWARD WITH THE MDR. >> JENNIFER CLONE WITH THE PROGRAM OFFICE. I HAVE TO SAY PRIOR TO THIS EXPERIENCE MY METADATA EXPOSURE HAS BEEN THROUGH JAG S WHERE METADATA CENTERS ARE DEFINED, AND WHETHER THEY'RE ADHERED TO OR NOT IS ANOTHER QUESTION, BUT THERE'S A FRAMEWORK TO WORK WITH. I FELT THIS DIALOGUE WAS ENLIGHTENING AND WHAT WAS REALLY WONDERFUL ABOUT IT WAS A LOT OF US ARE ON THE SAME PAGE AS FAR AS WHERE WE GO FROM HERE. I THOUGHT THAT WAS CERTAINLY HEARTENING. AND CERTAINLY A GOOD EXPERIENCE. THANK YOU. >> MY NAME IS KEVIN (INAUDIBLE) FROM THE JOHNS HOPKINS STUDY CENTER. IT IMS COORDINATOR THERE SO (INAUDIBLE) EFFORTS AND COORDINATING EFFORTS. 
I JOINED THE CENTER IN JUNE LAST YEAR AFTER BEING IN THE COMMERCIAL IT ENVIRONMENT FOR A WHILE. MY CHANGE OF THAT, BRING MY KNOWLEDGE AND SKILLS TO THE HEALTH AREA TO HELP CHECK DATA. I HAVE TO SAY BEFORE JOINING THE CENTER ONE THING TO FIGURE OUT WAS WHERE MY HAT WAS AND HOW TO HANDLE METADATA, TO MAKE IT EASIER FOR THIS TO ANSWER THE QUESTIONS BECAUSE THAT'S (INAUDIBLE) COME IN. THIS CONFIRMS THEY'RE IN THE RIGHT FIELD. I OFTEN SEE OUT THERE FOR US TO BRING (INAUDIBLE) TO THE TABLE AND HELP (INAUDIBLE) ASK THE QUESTIONS. >> THIS IS MIKE SINCLAIR. I'M LEADING UP THE DATA LINKAGE PROJECT AT NORC. BEING A LONG TERM STATISTICIAN, IT'S QUITE INTERESTING SHIFTING PERSPECTIVES ON WHAT WE MIGHT THINK OF AS DATA LINKAGE. WE HAVE BEEN WORKING ON TRYING TO IDENTIFY A VARIETY OF DIFFERENT EXTANT DATA SOURCES, IN THE TRADITIONAL STATISTICAL SETTING TRYING TO COMBINE DATA FOR A GROUP OF INDIVIDUALS THAT SHARE GEOGRAPHY OR CHARACTERISTICS. THIS WAS VERY INTERESTING, TO TRY TO BROADEN THIS AREA TO WHAT WE MIGHT CALL DATA ASSOCIATION, BRINGING TO BEAR INFORMATION ABOUT VARIOUS STUDIES IN TERMS OF HOW THEY WERE CONDUCTED, GENERAL FINDINGS, THE WAY THE QUESTION WAS ASKED, INTO THE ASSOCIATION FRAMEWORK. THAT'S HELPFUL. I'M GLAD TO SEE THIS IS LOOKED AT IN OTHER STUDIES. WE HAVE COMPONENTS IN PLACE, WE JUST DON'T ALWAYS THINK ABOUT LINKING THEM TOGETHER. AS I WAS TALKING WITH ANDREW AND KRISTINA PARK EARLIER TODAY, WE OFTEN GO OUT AND LOOK FOR WHAT OTHERS HAVE DONE, OBVIOUSLY. WE IDENTIFY THOSE SURVEYS. WE DEVELOP THOSE COMPANION QUESTIONNAIRES. WE DON'T OFTEN TAKE THE TIME TO GO FORWARD AND BUILD IN OTHER METADATA AND INFORMATION ABOUT THAT SURVEY. WE TRY TO LINK SURVEYS USED TO DEVELOP THE QUESTIONNAIRE TO THE SURVEY DATA AND INFORMATION ABOUT THOSE SURVEYS THAT WE CAN LINK TO THE NCS INFORMATION. >> THIS IS NED ENGLISH, NORC, UNIVERSITY OF CHICAGO. 
THINKING AS A SURVEY METHODOLOGIST IT WAS HELPFUL TO HEAR ABOUT THE STATE OF THE ART FROM A NON-PURE SURVEY PERSPECTIVE, MORE A MEDICAL AND IT SPEAKING ABOUT WHAT WE DO. SURVEY DATA PER SE. I WOULD LIKE TO SECOND THE PREVIOUS SENTIMENT ABOUT A LONGER TERM PANEL. MANY KEY PANEL STUDIES, OTHERS IN EUROPE, HAVE BEEN LONGER THAN THAT. BUT THANKS. >> I TOOK A LOOK AT THE OBJECTIVES OF THE MEETING. SEEMS LIKE A DIRECT HIT IN TERMS OF THE THREE MAIN OBJECTIVES THAT YOU SET OUT TO MEET IN A CONSTRAINED TIME FRAME, SO IMPRESSIVE COMING INTO THIS COLD. THIS WAS AN EYE OPENING EXPERIENCE, I COME OUT OF MEETINGS FEELING LIKE I HAVE A GREAT DEAL OF HOMEWORK TO DO IN ORDER TO FOLLOW UP ON ALL THE GREAT NEW IDEAS I WAS EXPOSED TO. TALKING TO OTHER STUDIES THAT SEEM TO BE SIMILAR TO WHAT'S GOING ON IN AUSTRALIA AND CANADA. INTERESTING TO LEARN FROM THEM AND GAIN PERSPECTIVES THAT MIGHT AFFECT THIS PROJECT. FROM THE (INAUDIBLE) PERSPECTIVE I SEE COMMON INTERESTS AND OVERLAPS, I HOPE WE'LL BE ABLE TO STAY WORKING TOGETHER AND PARTNER IN THE FUTURE. WE'RE CERTAINLY GOING IN SIMILAR DIRECTIONS AND COVERING A LOT OF SIMILAR TERRITORIES. THANK YOU VERY MUCH. >> DOUG THORP, UNIVERSITY OF UTAH. MY GOAL COMING HERE WAS TO TRY TO CATCH UP ON THE CONVERSATION AND DEVELOPMENT OF DATA SYSTEMS AT THE NCS AND THIS HAS BEEN HELPFUL. TRYING TO DO ANALYSIS LINKING THE LEGACY OR PHASE ZERO DATA TO SOME OF THE LIGHT TOUCH OR PHASE 1 DATA FROM OUR VANGUARD CENTER DATA. THAT'S INCREDIBLY CHALLENGING AND DIFFICULT. AND THAT'S JUST LOOKING BACK OVER TWO YEARS. SO THE IDEA IS CLEAR, WE HAVE TO DEVELOP SOMETHING TO MAKE IT SIMPLER AND EASIER TO WORK BACKWARDS OVER DECADES, PARTICULARLY AS WE HAND THIS OFF WITH STAFF TURNOVER. THIS IS WONDERFUL TO WATCH HERE. AS I WORK FOR THE STUDY I HAD A CHECKLIST OF QUESTIONS AND ISSUES AND I HAVE KIND OF BEEN CHECKING THEM OFF IN MY MIND AND SEEING A LOT OF POTENTIAL TO SOLVE PROBLEMS. AS A SOFTWARE DEVELOPER, THE CARDINAL SIN OF SOFTWARE DEVELOPMENT IS DUPLICATION. 
I SEE THIS METADATA REPOSITORY AND SINGLE SOURCE ABILITY TO GENERATE INSTRUMENTS AS A GREAT OPPORTUNITY TO AVOID THAT DUPLICATION. IT'S EXCITING TO SEE THE CONSENSUS AROUND THAT HERE. MISSED MY TURN, TIM (INAUDIBLE) UNIVERSITY OF CHICAGO, THANKS SO MUCH FOR THE INVITATION TO BE HERE. REALLY ENJOYED THE PRESENTATIONS TODAY. WHAT DR. HIRSCHFELD HIT ON, THE MOST IMPORTANT PART TO ME, IS THE IDEA OF TRUE PARTNERSHIPS. PASCAL AND I HAVE BEEN WORKING ON THIS METADATA, PROSELYTIZING TO PEOPLE, AND THERE'S A SIGNIFICANT UPTAKE IN THIS, AND I COME TO A MEETING AND CAN'T BE MORE ENTHUSIASTIC: YOU GET THE RIGHT PLAYERS AT THE MEETING, PEOPLE ARE STARTING TO GET THIS, IT'S PERMEATING THROUGHOUT THE DIFFERENT DISCIPLINES, SO I APPLAUD EVERYBODY HERE AS WELL. BUT I WANT TO END WITH WHAT I STARTED, AGREEING WITH THE IDEA OF PUSHING PEOPLE TOGETHER. YOU HAVE A LOT OF LIKE MINDED PEOPLE IN THIS TOGETHER THAT ARE A LOT DIFFERENT THAN PEOPLE MIGHT THINK. THANKS AGAIN FOR THE INVITATION. >> I WAS EXCITED TO HEAR THROUGHOUT THE DAY DISCUSSIONS AROUND ONTOLOGIES AND THE SEMANTICS AROUND THE RELATIONSHIPS, DYNAMICS, BEHAVIORAL SEMANTICS, AND HOW WE CAN GO FROM A METADATA STRUCTURE AND HOW THAT CONNECTS WITH RDF AND ONTOLOGY AND SPARQL QUERIES, SO I'M EXCITED TO SEE HOW THAT DEVELOPS OVER TIME. AND ALSO IDEAS WAYNE BROUGHT UP, OR THE PROGRAM, THE CLINICAL INFORMATION MODELING INITIATIVE, AND HOW THERE MIGHT BE A TOUCH POINT WITH THAT AND THE HEALTH TRAJECTORY DISCUSSION FROM EARLIER IN THE DAY. ALSO THE CONCEPT OF A DOMAIN ANALYSIS MODEL AND HOW THAT MIGHT EVOLVE OUT FROM THE PEDIATRIC TERMINOLOGY FRAMEWORK. THE DEVELOPMENTAL STAGES AND HOW THAT DEFINES ANALYSIS MODELING, GOING FROM THAT ANALYSIS MODEL TO THE IMPLEMENTATION AND GETTING THOSE PIECES IN PLACE. YOU SAW THOSE DIFFERENT COMPONENTS REPEATED THROUGHOUT THE TALKS TODAY. >> (INAUDIBLE) WARREN KIBBE AGAIN. 
IT'S EXCITING TO SEE ALL OF US PRESENTING DIFFERENT ASPECTS OF SOME OF THE EXACT SAME ISSUE, HOW WE MOVE FORWARD, AND I THINK FOR ME PERSONALLY IT'S BEEN GREAT BEING INVOLVED IN NCS AND THINGS THAT HAPPENED AND THE GREAT RELATIONSHIPS WITH ALL THE PEOPLE INVOLVED IN NCS. THAT'S PART OF THE FUN OF IT. US ALL WORKING TOGETHER. THE ONLY DOWNSIDE IS MARK ADAMS COMES AFTER ME AND HE WILL SAY SOMETHING MORE PROFOUND THAN I. THAT'S HORRIBLE. >> VERY LOW RISK. >> PLEASURE TO HAVE THE LAST WORD BECAUSE I HAVE THREE GIRLS AT HOME. I NEVER GET THAT. IT'S TERRIFIC. ONE OF THE THINGS THAT WAS INTERESTING TO ME IS LOOKING AT HOW MUCH CONVERGENCE THERE IS IN THIS, FRANKLY, VERY DIVERSE GROUP WITH WHAT DR. HIRSCHFELD OUTLINED AS THE CORE VALUES WE TRY TO BUILD THIS UPON. THERE CLEARLY IS OPPORTUNITY OUT THERE, I THINK AS WAS SAID EARLIER, TO GO IN AND MAKE USE OF THE TOOLS, CAPABILITIES AND FRANKLY THE TALENTS OUT THERE AND NOT HAVE TO REINVENT THIS PARTICULAR WHEEL. AS WE HAVE SAID, GETTING THIS RIGHT IS GOING TO DEPEND ON REALLY TAKING A HARD CLOSE LOOK AT LESSONS LEARNED IN PREVIOUS EFFORTS, PREVIOUS ACTIVITIES, AND DRAWING UPON THOSE ACTIVITIES THAT WERE SUCCESSFUL. SO THANK YOU VERY MUCH TO EVERYBODY FOR COMING AND SHARING YOUR EXPERTISE AND EXPERIENCE, GREAT. WE HAVE A GOOD WAY FORWARD. >> THANK YOU ALL. NOTE THAT EVERY CONCLUSION IS A WAY TO HAVE A NEW BEGINNING SO WE'RE BEGINNING THE NEW ERA OF METADATA. SOME OF YOU MAY ASK WHY I'M IN UNIFORM, BECAUSE MY REAL JOB IS TO FUNCTION AS A CHAUFFEUR AND I HAVE TO GET INTO THE PARENT TAXI MOMENTARILY. SO OUR CHILDREN ARE OUR FUTURE, I HAVE TO GO OFF AND ADDRESS OUR FUTURE SO GOOD NIGHT.