Informatics Case Study: From Data Management and Integration to
Transcription
Informatics Case Study: From Data Management and Integration to Marker-Based Diagnostics Models
Bob Stanley, VP, Chief Technology Officer
IO Informatics, Inc., Emeryville, CA
www.io-informatics.com
Beyond Genome, 2006

NIST Advanced Technology Program
- 2002: Icoria awarded 5-year, $11.7M grant on Target Assessment Technology using Systems Biology
- 2005: Icoria & IO-Informatics 2-year joint venture for data integration using Intelligent Multidimensional Object (IMO) Technology
Icoria: A Clinical Data, Inc. Company

Major Milestones: Coherent Data
- Data management - deployment, scale, testing
    Sentient Data Management and Suite
- Data-driven efficiency - workflow, process management
    Sentient Process Manager
- Associative networks - diagnostics modeling, screening
    Sentient Knowledge Explorer

Data / Information Management: Integration and Scale Testing Phase
Requirement: “My users must be able to [simply] drag into a folder or right-click on any file or set of files to enter data into the [‘iPool’] database.”
    Director, Icoria division of CLDA
- Files - applications output
- Instrument output - images, meta-data
- LIMS - database records
- Database query results
- Web database query results

“Getting started” Architecture
Our Goals:
- Avoid heavy time investment for initial roll-out phase
- Short runway to roll-out with tangible value
Initial roll-out requirements:
- Lightweight scale-up installation, existing apps integrated, scale tested
- Refine use cases / demonstrate value for user roles prior to roll-out

Result: Different Data, One View
User experience: Easy import, unified curation, unified views, annotation, queries, reports, auditing, integrity checking

Scale-up Efficiency
Our Goals:
- Scale-up based on approved requirements, use cases and reviewed prototypes
- Maximize use of existing IT and applications

Process Management
Requirement: “We need a system for process management that can take data and workflows from each of the many groups associated with Icoria into
account within a larger project framework.”
    Chief Science Officer, Icoria

Knowledge Explorer: Prototyping and Testing
Requirements gathering / prototyping phase
- Interview Icoria users, customers, science advisers, peers
- Iterate and refine existing Sentient products and roadmap via use cases

User Requirements: Knowledge Explorer
Example Requirement: “We’d like to create associative networks to model and screen for diagnostic markers. This should represent genes, metabolites, tissue data, compounds and clinical endpoints associated by - for example - common identifiers, fold change, strength of correlation and reference to external knowledge bases.”
    Director, Systems Biology, Icoria
Prototype - networks of related knowledge created to validate liver toxicity and cancer metabolic markers

Application - Use Case: Knowledge Explorer
Applied to:
- Liver toxicity - NIEHS / Icoria Compendium study - metabolic marker and study data in Oracle, supporting gene expression and tissue data accessed via Sentient
- Cancer markers - “Cancer Study” - BCP markers with supporting data
Immediate value to Icoria researchers and clients and ultimately to the point of care
- Contextualized visualization and comparison of markers delivers understanding, validation, sharing, screening
- Applications include result reporting; case, study and disease stratification; adverse event reporting

Resulting Prototype: Knowledge Explorer
- Refinement: Icoria, peers, Scientific Advisors and Partners review
- Implementation - IO and semantic methods (IMO, RDF, OWL)

Summary
Major “Coherent Data” Milestones:
- Current use - integrated data management and query functions
    Users are now able to run their own queries on formerly inaccessible data
- Rolling out - unified workflow, process management functions
    Data-driven alerts for project actions and deadlines become possible
- Prototyping and refining - knowledge explorer / modeling functions
    Unified diagnostics modeling, viewing and screening - using data from diverse sources
    - becomes possible

Keys to Success?
We’ve focused on:
- Low-impact, targeted communication and implementation
- Immediate, signed-off, practical benefits - and growing from there!

References
Bouquet, P.; Giunchiglia, F.; van Harmelen, F.; Serafini, L.; Stuckenschmidt, H. C-OWL: contextualizing ontologies. Web Semantics: Science, Services, and Agents on the World Wide Web 2004, 1, 325–43.
Glassbrook, N.; Ryals, J. A systematic approach to biochemical profiling. Curr. Opin. Plant Biol. 2001, 4(3), 186–90.
Gombocz, E.; Stanley, R. Achieving interoperability in Systems Biology: New informatics methods for user-centric, lightweight integration of heterogeneous data. Poster at the 4th International Symposium on Challenges in Systems Biology, Institute for Systems Biology, Seattle, WA, April 24-25, 2005.
Hancock, W.; Wu, S.; Stanley, R.; Gombocz, E. The challenge of publishing large proteome datasets: the meeting of scientific policies and emerging technologies. Trends Biotechnol. (suppl.) 2002, 20(12), 39–44.
Stanley, R.; Hancock, W. Bioinformatics in the clinic: challenges and opportunities for improved trials and clinical care. Genom. Proteom. Tech. 2003, 3(3), 29–36.
Wang, X.; Gorlitsky, R.; Almeida, J.S. From XML to RDF: how semantic web technologies will change the design of “omic” standards. Nat. Biotech. 2005, 23(9), 1099–1103.

Acknowledgements
IO Informatics would like to thank the following:
- Icoria Division of Clinical Data, Inc.
  (CLDA)
- Cogenics Division of CLDA
- National Institute of Standards and Technology (NIST ATP)

Contributors include:
- Imran Shah, Principal Investigator, Icoria, CLDA
- Kevin Lutz, Grant Administrator, Cogenics, CLDA
- Paul Dibello, Associate Director, Software Engineering, Cogenics, CLDA
- Tim Hall, Project Manager, NIST
- Erik Puskar, Business Manager, NIST
- Erich Gombocz, Chief Science Officer, IO Informatics
- Chuck Rockey, Software Project Manager, IO Informatics
- Mike Travers, Principal Engineer, Knowledge Systems, IO Informatics
- Tom Colatsky, Maureen McBride, Omid Omidvar, Alan Higgins, Pat Hurban, Max Fedor, Hongkang Mei, Judong Shen and others who have contributed.
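
Illustrative sketch (not part of the original slides): the Knowledge Explorer requirement quoted earlier asks for associative networks in which genes, metabolites and clinical endpoints are linked by measures such as fold change and strength of correlation. A minimal way to picture that structure in Python is shown here. This is not the Sentient Knowledge Explorer implementation, which the slides describe only as using IMO, RDF and OWL. The class names, the `screen_markers` helper, the identifiers and all numeric values are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A node in the network; kind is e.g. 'gene', 'metabolite' or 'endpoint'."""
    identifier: str
    kind: str


@dataclass
class AssociativeNetwork:
    # Adjacency map: entity -> list of (neighbor, relation, weight) tuples.
    edges: dict = field(default_factory=dict)

    def associate(self, a, b, relation, weight):
        """Add an undirected, typed, weighted association between two entities."""
        self.edges.setdefault(a, []).append((b, relation, weight))
        self.edges.setdefault(b, []).append((a, relation, weight))

    def screen_markers(self, endpoint, relation, min_weight):
        """Return (entity, weight) pairs linked to `endpoint` by `relation`
        whose absolute weight is at least `min_weight`, strongest first."""
        hits = [
            (neighbor, weight)
            for neighbor, rel, weight in self.edges.get(endpoint, [])
            if rel == relation and abs(weight) >= min_weight
        ]
        return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


# Hypothetical entities and invented values, for illustration only.
liver_tox = Entity("ENDPOINT:liver_toxicity", "endpoint")
gene_a = Entity("GENE:A", "gene")
met_x = Entity("MET:X", "metabolite")
met_y = Entity("MET:Y", "metabolite")

net = AssociativeNetwork()
net.associate(gene_a, liver_tox, "correlation", 0.82)
net.associate(met_x, liver_tox, "correlation", -0.91)
net.associate(met_y, liver_tox, "correlation", 0.30)
net.associate(gene_a, met_x, "fold_change", 2.5)

# Screen for candidate markers strongly correlated with the endpoint.
candidates = net.screen_markers(liver_tox, "correlation", min_weight=0.8)
for entity, weight in candidates:
    print(entity.kind, entity.identifier, weight)
```

Keeping the relation type on each edge lets one network hold several kinds of association (correlation, fold change, shared identifiers) while a screen filters on just one of them. An RDF-based implementation would express the same idea as subject-predicate-object triples.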