Workflow

Insid&Out: how to save Medicinal chemists from data-related frustration
Towards a Unified Laboratory Intelligence
Fabio Rancati
Lead Optimization unit 1 - Head
Parma – 7th July 2016
Summary
• Chiesi R&D
   • General and Preclinical
• Chiesi data exchange Platform
   • Implementation of a non-human-based system for data upload
• Integration of calculation package
• Conclusion
Rancati Fabio | Parma – 7th July 2016
Chiesi R&D
• R&D is mainly focused on the respiratory area
• >400 scientists in the Parma R&D centre
• Preclinical: 90 scientists
• All projects run in collaboration with CROs
   • Integrated big collaborations
   • Ad hoc, task-oriented collaborations
• >60% of work is outsourced
• 50-90% of data is produced outside Chiesi, depending on project phase
• IT support focused on infrastructure
• Limited chemoinformatic support
Med Chem & data

• Data is food for the Med Chem’s mind
[Cycle diagram: Design → Synthesize → Test → Analyze Data]
• A single data point has no meaning unless it is part of a wider dataset (database)
“I cannot work without a database”
Luca Sartori 6th June 2012 - Parma
Med Chem & database: almost complete strangers
“Why don’t we create an Excel database?”
“I searched Chemfinder for ADME data and didn’t find any”
“I’ll keep the PowerPoint presentation updated with all the data”
“All you need is an Excel spreadsheet” → “All you need is a DATABASE”
Med Chem & data: a difficult relationship
The starting point
[Diagram labels: Corporate dB CHF (only chemistry data); Local dB (Project 1, Project 2, Project 3, Project …)]
• Management of local databases was frustrating
• The entire local dB was often exchanged with the CRO collaborator for updating
• Versioning
Project purpose

• Create a single repository of data
• Eliminate all project-related local dBs
• Med Chem oriented
• Make importing data produced by CROs easier
• Share data
• Data management
• Security and safety of data
• Limited internal knowledge and resources
• Limited possibility of assembling a pool with the best tool for each purpose
Electronic LABoratory Knowledge Management
[Diagram labels: Local dB (Project 1, Project 2, Project 3, Project …); dB CHD; Data upload (Pharma, Analytical, Phys Chem ….); Tool for data visualization and analysis]
Project Objectives
• Consolidate project dBs in one central dB: Chem & Pharma data
• Increase security of data
• Avoid versioning
• Share data more easily, internally and with CROs
Workflow
[Diagram labels: Compound synthesis in Chiesi or @CRO; SD file; ELN; Compound Registration; CHD Generation; CHD code; CRO1, CRO2, CRO3, CRO…; Chiesi Laboratories; Data generation; Several file formats; Cloud; data upload; File download; Chiesi dB]
• Registration of the compound is the limiting step for data release and upload
• Data exchange is based on different templates
*Images used are all public
Workflow

• The system is human based, not automatic
• Alerting is required to start any registration step @Chiesi
• Not systematic (data files missed)
• Data duplication (Chiesi and CRO site)
• Delay from data generation to data publication
• Based on availability of key users
• Data not available
Many factors, including simple human errors such as typos, are often a cause of FRUSTRATION in daily work, e.g. in the middle of a meeting when your boss asks for that piece of data.
Workflow improvement - analysis

• Process automation
• Avoid alerting
• Limit human intervention
   • Missed files
   • Repeated files
• Intercept and correct human error
The solution to these problems was the use of a data mining tool.
But before this:
• Generation of a set of templates for data exchange, one for each protocol
• Identification of a dictionary of the most common errors in data provided by CROs
• Detailed instructions given to CROs (procedure)
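A dictionary of the most common errors can be pictured as an ordered list of find-and-replace rules applied to every incoming value before upload. The sketch below is only an illustration in Python, not the Knime workflow described in the talk; the rules themselves (stray whitespace, decimal commas, "no value" placeholders) and the names COMMON_ERRORS and clean_value are assumptions chosen for the example.

```python
import re

# Hypothetical rules; a real dictionary would be built from the errors
# actually observed in CRO files.
COMMON_ERRORS = [
    (re.compile(r"^\s+|\s+$"), ""),                # leading/trailing whitespace
    (re.compile(r"(?<=\d),(?=\d)"), "."),          # decimal comma -> decimal point
    (re.compile(r"^(n\.?a\.?|nd|-)$", re.I), ""),  # placeholders meaning "no value"
]

def clean_value(raw: str) -> str:
    """Apply every correction rule, in order, to one cell value."""
    value = raw
    for pattern, replacement in COMMON_ERRORS:
        value = pattern.sub(replacement, value)
    return value

print(clean_value(" 12,5 "))  # 12.5
print(repr(clean_value("N.A.")))  # ''
```

Keeping the rules in a plain list makes it easy to add a new entry each time a CRO invents a new kind of typo.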
Workflow implementation

• One template for each protocol
   • Template with hidden sheet containing protocol info for identification
• Same template for all CROs
• Process automation: sync cloud with local resources
• Avoid alerting: identify new files or new versions
• Limit human intervention and missed files: combine files for the same protocol
• Intercept human error: correct the most common errors
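One way to identify new files or new versions without anybody sending an alert is to compare a content hash of every synced file with the hash recorded at the previous run. The Python sketch below illustrates that idea only; it is not the Knime implementation, and the function names, the "*.xlsx" pattern and the idea of keeping a name-to-digest record between runs are all assumptions.

```python
import hashlib
from pathlib import Path

def folder_digests(folder: Path, pattern: str = "*.xlsx") -> dict[str, str]:
    """Map each file name in the synced folder to the SHA-256 of its content."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(folder.glob(pattern))
    }

def classify(current: dict[str, str], seen: dict[str, str]) -> dict[str, list[str]]:
    """Split files into new, new_version and unchanged relative to the previous run."""
    result = {"new": [], "new_version": [], "unchanged": []}
    for name, digest in current.items():
        if name not in seen:
            result["new"].append(name)          # never processed before
        elif seen[name] != digest:
            result["new_version"].append(name)  # same name, different content
        else:
            result["unchanged"].append(name)    # nothing to do
    return result
```

A nightly run would call folder_digests on the local copy of the cloud folder, pass the result to classify together with the stored record, process only the "new" and "new_version" files, and then save the current digests for the next run.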
Workflow implementation
Sync
Workflow achievement
Few factors limit the workflow efficiency:
• Before data release by the CRO, compounds have to be registered so that the compound code is generated
• Human error in filling in the data template
• The Knime workflow performs several data integrity checks
• Data is uploaded to the dB, in a different set of tables, every night
• Minimal delay from data release to data publication
• Everything was done without any change in the end users’ behavior or habits
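The talk does not list which integrity checks the Knime workflow performs. As a hedged illustration of the kind of check that fits the constraints above (a compound must be registered before its data can be uploaded, and templates are filled in by hand), the Python sketch below validates one template row. The field names, the example compound codes and the function check_row are hypothetical.

```python
def check_row(
    row: dict[str, str],
    registered_codes: set[str],
    required: tuple[str, ...] = ("compound_code", "protocol", "result"),
) -> list[str]:
    """Return a list of problems found in one row; an empty list means the row passes."""
    problems = []
    for field in required:
        if not row.get(field, "").strip():
            problems.append(f"missing {field}")
    code = row.get("compound_code", "").strip()
    if code and code not in registered_codes:
        # Data cannot be linked to a compound that has no code yet.
        problems.append(f"unregistered compound code: {code}")
    result = row.get("result", "").strip()
    if result:
        try:
            float(result)
        except ValueError:
            problems.append(f"non-numeric result: {result}")
    return problems

registered = {"CHD-0001"}  # hypothetical code format
print(check_row({"compound_code": "CHD-0001", "protocol": "plasma stability", "result": "87.5"}, registered))
# []
print(check_row({"compound_code": "CHD-0002", "protocol": "", "result": "12,5"}, registered))
# ['missing protocol', 'unregistered compound code: CHD-0002', 'non-numeric result: 12,5']
```

Returning the full list of problems, rather than stopping at the first one, makes it possible to send the CRO a single complete report per refused file.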
Workflow improvement
% of files containing errors
[Bar chart, y-axis 0 to 35%: Initial situation; Standard template and procedure to CROs; 1 year after]
Changes in the data exchange procedure caused an increasing number of files to be refused by the Knime workflow
Workflow achievement
[Diagram labels: Sync; Central dB]
• The implemented workflow made data transfer from CRO to Chiesi (CRO2Chiesi) more efficient
• Reduced delay from data generation to publication
• Human error requires continuous CRO monitoring and education
   • Frequent changes in CRO teams
   • Scientists in CROs are human beings
   • Errors are normal and can be limited
   • Imagination is limitless
Less data missed = lower Med Chem’s frustration
Med Chem & chemoinformatic: right tool in wrong hands
Data mining operations such as mean calculation, pivoting or statistical analysis can become dangerous tools in the wrong hands if you do not know the data structure and relationships.
• 5 compounds
• Plasma stability: 7 species, 2 providers
• Hepatocytes: 6 species, 2 providers
Automatic pivot generates up to 168 permutations/compound (7 × 2 × 6 × 2)
This real case generated 311 rows instead of 5
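The figure of 168 follows from pairing every plasma stability row (7 species × 2 providers = 14) with every hepatocyte row (6 species × 2 providers = 12) of the same compound. The Python sketch below reproduces that arithmetic with dummy data and contrasts it with pivoting each assay into its own columns first. It is an illustration only: the species and provider names, the zero values and the helper functions are invented, and it does not try to reproduce the 311 rows of the real case.

```python
from itertools import product

PROVIDERS = ["provider_A", "provider_B"]

def long_table(compound, assay, n_species):
    """One row per (species, provider) for a single compound; values are dummies."""
    species = [f"species_{i}" for i in range(1, n_species + 1)]
    return [
        {"compound": compound, "assay": assay, "species": s, "provider": p, "value": 0.0}
        for s, p in product(species, PROVIDERS)
    ]

plasma = long_table("cpd_1", "plasma_stability", 7)  # 7 x 2 = 14 rows
hepatocytes = long_table("cpd_1", "hepatocytes", 6)  # 6 x 2 = 12 rows

# Naive join on the compound key alone: every plasma row is paired with
# every hepatocyte row of the same compound.
naive_join = [
    (a, b) for a in plasma for b in hepatocytes if a["compound"] == b["compound"]
]

def pivot_wide(rows):
    """Collapse a long table to one dict per compound, one column per condition."""
    wide = {}
    for r in rows:
        column = f'{r["assay"]}|{r["species"]}|{r["provider"]}'
        wide.setdefault(r["compound"], {})[column] = r["value"]
    return wide

wide = pivot_wide(plasma + hepatocytes)

print(len(naive_join))     # 168 rows for one compound
print(len(wide))           # 1 row for one compound
print(len(wide["cpd_1"]))  # 26 columns (14 + 12)
```

Knowing that species and provider are part of the key of each measurement is what turns 168 rows per compound back into one.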
Integration of calculation package

• Percepta* is a common tool for property calculation
• Desktop version for single-compound analysis or user-managed calculation of batch properties
• Batch module needed for automatic calculation run from the command line
• Knime can be useful to prepare the input file and upload the calculated data to the dB
• The main difference between desktop and batch is in the pKa output (graphical vs tabular)
* by Advanced Chemistry Development, Inc. (ACD/Labs)
Integration of calculation package
pKa values visualization in Desktop version
Workflow Percepta – standard pKa output
pKa values visualization in Batch version
Atom number    pKa
2              8.77
29             10.29
43             8.77

• Link through atom number
• Not always easy to identify the atom in a big molecule
• Similar molecules with different atom numbering cause confusion
Workflow Percepta Batch

• The Knime workflow runs automatically at night
• It extracts all new compounds from the corporate dB and generates the SD file
• The SD file is used in the command-line string to run Percepta batch
• The SD file is manipulated to change the atoms’ properties and attach the calculated properties as atom labels
• The output SD file is modified to embed the calculated pKa values and is converted into an image (filling the gap between the desktop and batch modules)
• Provides the same user experience as the Desktop version
   • Calculation results easier to “read”
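The talk does not say how the calculated values were attached to the atoms. One mechanism that exists in the V2000 molfile format is the atom alias: a line "A  aaa" (aaa = atom number) followed by a line of free text, placed in the properties block before "M  END". The Python sketch below uses that mechanism purely as an assumption to show the idea of linking a tabular "atom number, pKa" output back to the structure; the function name, the tiny two-atom molblock and the pKa value are placeholders, not Percepta output.

```python
def add_pka_labels(molblock: str, pka_by_atom: dict[int, float]) -> str:
    """Insert one atom-alias entry per atom, e.g. 'N pKa 10.60', before 'M  END'."""
    lines = molblock.splitlines()
    end = lines.index("M  END")
    n_atoms = int(lines[3][:3])  # counts line: first 3 characters = number of atoms
    alias_lines = []
    for atom_number in sorted(pka_by_atom):
        if not 1 <= atom_number <= n_atoms:
            raise ValueError(f"atom {atom_number} is not in the molecule")
        # Atom lines start right after the counts line; the symbol is in columns 32-34.
        symbol = lines[3 + atom_number][31:34].strip()
        alias_lines.append(f"A  {atom_number:>3}")
        alias_lines.append(f"{symbol} pKa {pka_by_atom[atom_number]:.2f}")
    return "\n".join(lines[:end] + alias_lines + lines[end:]) + "\n"

# Hypothetical two-atom example (C-N); the pKa value is a placeholder.
MOLBLOCK = """methylamine
  sketch
hypothetical example
  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.5000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
M  END
"""

print(add_pka_labels(MOLBLOCK, {2: 10.6}))
```

Because the label travels with the atom rather than with its number, the reader no longer has to find "atom 29" in a large structure; rendering the labelled structure to an image is a separate step not shown here.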
Workflow Percepta
Workflow Percepta – our modified pKa output

• The modified pKa output is certainly much easier to read
• As in the Desktop version, no confusing numbers are present in the generated picture
Conclusion
To save a medicinal chemist from data-related frustration:
• provide the data needed for daily work
• do not ask him to think about where data is stored or generated (INSIDE or OUTside the company), otherwise he could feel ANGER
• do not ask him to use a chemoinformatic tool himself, as he could feel FEAR or, in certain cases, even DISGUST
• To be a JOY for a medicinal chemist, any new tool must solve a problem and not generate SADNESS. He often considers such tools something that works magic.
• Medicinal CHEMistry and CHEMoinformatics have something in common
Acknowledgement
Chiesi colleagues
Ferramola Camilla
Annamaria Capelli
Andrea Rizzi
Marcello Biscaioli
Med Chems
Elisabetta Armani
DV Informatica
S-IN Soluzioni informatiche
Angelo Rigamondi
Andrea Ciacci
Giuseppe Martinoli
Thank you