Workflow
Insid&Out: how to save Medicinal chemists from data-related frustration
Towards a Unified Laboratory Intelligence
Fabio Rancati, Lead Optimization unit 1 - Head
Parma – 7th July 2016

Summary
• Chiesi R&D
• Chiesi data exchange Platform
• General and Preclinical
• Implementation
• Non-human based system data upload
• Integration of calculation package
• Conclusion

Rancati Fabio | Parma – 7th July 2016

Chiesi R&D
• R&D is mainly focused on the respiratory area
• >400 scientists in the Parma R&D centre
• Preclinical: 90 scientists
• All projects run in collaboration with CROs
  • Integrated big collaborations
  • Ad hoc, task-oriented collaborations
• >60% of work is outsourced
• 50-90% of data is produced outside Chiesi, depending on project phase
• IT support is focused on infrastructure
• Limited chemoinformatic support

Med Chem & data
• Data is food for the Med Chem’s mind
• [Diagram labels: Synthesize, Design, Test, Analyze, Data]
• A single data point has no meaning if it is not part of a wider dataset (database)
• “I cannot work without a database” (Luca Sartori, 6th June 2012, Parma)

Med Chem & database: almost complete strangers
• “Why we do not create an excel database?”
• “I searched Chemfinder for ADME data, and I didn’t find”
• “I’ll keep updated the power point presentation with all data”
• “All you need is an excel spread sheet” (slide annotation: “a DATABASE”)

Med Chem & data: a difficult relationship

The starting point
• Corporate dB (CHF): only chemistry data
• Local dBs: Project 1, Project 2, Project 3, Project …
• Management of local databases was frustrating
• Often the entire local dB was exchanged with the CRO collaborator for updating
• Versioning

Project purpose
• Create a unique repository of data
• Eliminate all project-related local dBs
• Med Chem oriented
• Make it easier to import data produced by CROs
• Share data
• Data management
• Security and safety of data
• Limited internal knowledge and resources
• Limited possibility of assembling a pool with the best tool for each purpose

Electronic LABoratory Knowledge Management
[Diagram labels: Local dB (Project 1, Project 2, Project 3, Project …); dB CHD; Data upload (Pharma, Analytical, Phys Chem, …); Tool for data Visualization and Analysis]
Project objectives
• Consolidate project dBs in one central dB: Chem & Pharma data
• Increase security of data
• Avoid versioning
• Share data more easily, internally and with CROs

Workflow
[Diagram labels: CHD code; Compound Registration; CHD Generation; CRO1, CRO2, CRO3, CRO…; Data generation; Several file formats; Cloud ELN; SD file; Compound synthesis in Chiesi or @CRO; Chiesi Laboratories; data upload; File download; Chiesi dB]
• Registration of the compound is the limiting step for data release and upload
• Data exchange is based on different templates
*Images used are all public

Workflow
• The system is human based, not automatic
• Alerting is required to start any registration step @Chiesi
• Not systematic (data files missed)
• Data duplication (Chiesi and CRO site)
• Delay from data generation to data publication
• Based on availability of key users
• Data not available
Many factors, including simple human errors like typos, are often a cause of FRUSTRATION in daily work, e.g. in the middle of a meeting when your boss asks for that piece of data.

Workflow improvement - analysis
• Process automation
• Avoid alerting
• Limit human intervention
  • Missed file
  • Repeated file
• Intercept and correct human error
The solution to these problems was the use of a data mining tool. But before this:
• Generation of a set of templates for data exchange, one for each protocol
• Identification of a dictionary of the most common errors in data provided by CROs
• Detailed instructions given to CROs (procedure)

Workflow implementation
• One template for each protocol
  • Template with a hidden sheet containing protocol info for identification
• Same template for all CROs
• Process automation: sync cloud with local resources
• Avoid alerting: identify new files or new versions
• Limit human intervention and missed files: combine files for the same protocol
• Intercept human error: correct the most common errors

Workflow implementation
[Three diagram slides labelled: Sync]

Workflow achievement
Few factors limit the workflow efficiency:
• Before data release by the CRO, compounds have to be registered for the generation of the compound code
• Human error in filling in the data template
• The Knime workflow performs several data integrity checks
• Data is uploaded to the dB in a different set of tables every night
• Minimal delay from data release to data publication
• Everything was done without any change in the end-user’s behavior or habits

Workflow improvement
[Chart: % of files containing errors, axis from 0 to 35; labels: Initial situation; Standard template and procedure to CROs; 1 Year after]
• Changes in the data exchange procedure caused an increasing number of files to be refused by the Knime workflow

Workflow achievement
[Diagram labels: Sync; Central dB]
• The implemented workflow made data transfer CRO2Chiesi more efficient
• Reduced delay from data generation to publication
• Human error requires continuous CRO monitoring and education
  • Frequent changes in CRO team
  • Scientists in CROs are human beings
  • Errors are normal and can be limited
  • Fantasy is limitless
• Less data missed = lower Med Chem’s frustration

Med Chem & chemoinformatic: right tool in wrong hands
Data mining operations like mean calculation, pivoting or statistical analysis can become dangerous tools in the wrong hands if you do not know the data structure and relationships.
• 5 compounds
• Plasma stability: 7 species, 2 providers
• Hepatocytes: 6 species, 2 providers
• An automatic pivot generates up to 168 permutations/compound
• This real case generated 311 rows instead of 5

Integration of calculation package
• Percepta* is a common tool for property calculation
• Desktop: for single compound analysis or user-managed calculation of batch properties
• Batch module: needed for automatic calculation run from the command line
• Knime can be useful to prepare the input file and upload calculated data to the dB
• The main difference between desktop and batch is in the pKa output (graphical/tabular)
* by Advanced Chemistry Development, Inc. (ACD/Labs)

Integration of calculation package
pKa values visualization in Desktop version

Workflow Percepta – standard pKa output
pKa values visualization in Batch version
Atom number    pKa
2              8.77
29             10.29
43             8.77
• Link through atom number
• Not always easy to identify the atom in a big molecule
• Similar molecules with different atom numbering cause confusion

Workflow Percepta Batch
• The Knime workflow runs automatically at night-time
• It extracts all new compounds from the corporate dB and generates the SD file
• The SD file is used in the command line string to run Percepta batch
• The SD file is manipulated to change atoms’ properties and attach the calculated properties as atom labels
• The output SD file is modified to embed the calculated pKa values and converted into an image (filling the gap between the desktop and batch modules)
• Provides the same user experience as the Desktop version
  • Calculation results easier to “read”

Workflow Percepta

Workflow Percepta – our modified pKa output
• The modified pKa output is much easier to read
• No confusing numbers are present in the generated picture, unlike in the Desktop version

Conclusion
To save a medicinal chemist from data-related frustration:
• provide the data needed for daily work
• do not ask him to think about where data is stored or generated (INSIDE or OUTside the company), otherwise he could feel ANGER
• do not ask him to use a chemoinformatic tool himself, as he could feel FEAR or in certain cases even DISGUST
To be a JOY for a medicinal chemist, any new additional tool must solve a problem and not generate SADNESS. He often considers those tools something that does magic.
Medicinal CHEMistry and CHEMoinformatics have something in common

Acknowledgement
• Chiesi colleagues
• Ferramola Camilla
• Annamaria Capelli
• Andrea Rizzi
• Marcello Biscaioli
• Med Chems
• Elisabetta Armani
• DV Informatica
• S-IN Soluzioni informatiche
• Angelo Rigamondi
• Andrea Ciacci
• Giuseppe Martinoli

Thank you
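
Worked note on the “right tool in wrong hands” slide
The figure of 168 permutations per compound quoted on that slide follows from simple arithmetic: 7 species × 2 providers gives 14 plasma stability results, 6 species × 2 providers gives 12 hepatocyte results, and joining the two on the compound alone pairs every result with every other, 14 × 12 = 168. The sketch below reproduces that count on made-up data and contrasts it with a pivot that keeps assay, species and provider as explicit column keys. It is only an illustration in plain Python, not the Knime workflow used at Chiesi; the compound names, the placeholder values and the function names are all assumptions.

```python
from itertools import product

# Dimensions taken from the slide; compound names and values are invented.
PLASMA_SPECIES = 7
HEPATOCYTE_SPECIES = 6
PROVIDERS = 2
COMPOUNDS = ["cpd-1", "cpd-2", "cpd-3", "cpd-4", "cpd-5"]


def make_rows(assay, n_species):
    """Build one long-format row per compound, species and provider."""
    return [
        {"compound": c, "assay": assay, "species": s, "provider": p, "value": 1.0}
        for c, s, p in product(COMPOUNDS, range(n_species), range(PROVIDERS))
    ]


def naive_join(left, right):
    """Join on compound only: each left row pairs with every right row."""
    return [
        (l, r) for l, r in product(left, right) if l["compound"] == r["compound"]
    ]


def pivot_wide(rows):
    """One row per compound, one column per (assay, species, provider)."""
    table = {}
    for row in rows:
        column = (row["assay"], row["species"], row["provider"])
        table.setdefault(row["compound"], {})[column] = row["value"]
    return table


plasma = make_rows("plasma stability", PLASMA_SPECIES)
hepatocytes = make_rows("hepatocytes", HEPATOCYTE_SPECIES)

# Joining without the species/provider keys multiplies the rows.
joined = naive_join(plasma, hepatocytes)
rows_per_compound = len(joined) // len(COMPOUNDS)

# Keeping the keys as columns gives back one row per compound.
wide = pivot_wide(plasma + hepatocytes)

print(rows_per_compound)  # 168
print(len(wide))          # 5
```

With a complete dataset the naive join yields 168 rows per compound; the 311 rows reported on the slide are presumably lower than the theoretical maximum because not every species and provider combination had been measured for every compound.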