New Scoring Approach and a System in Structure-based Virtual Screening
Vannajan Sanghiran Lee (a*), Piyarat Nimmanpipug (a), Jeerayut Chaijaruwanich (b), Sukon Prasitwattanaseree (c), and Patrinee Traisathit (c)
(a) Department of Chemistry, Faculty of Science, Chiang Mai University, Chiang Mai, 50200 Thailand
(b) Department of Computer Science, Faculty of Science, Chiang Mai University, Chiang Mai, 50200 Thailand
(c) Department of Statistics, Faculty of Science, Chiang Mai University, Chiang Mai, 50200 Thailand
Email: vannajan@gmail.com
Computer Simulation and Modeling Laboratory, Chiang Mai University, Chiang Mai, 50200 Thailand
Tel. 66-53-943341-5 ext. 117 Fax. 66-53-892277
Abstract:
Virtual screening of chemical databases is now a well-established method for finding new hit candidates in the drug discovery process. In the present paper, we have developed an integrated
approach that combines a novel descriptor, which captures the amino acid residues in the binding pocket obtained from molecular docking, with conventional descriptors generated using JOElib (a
platform-independent, open-source computational chemistry package written in Java), and compared it with descriptors generated from the MOE (Molecular Operating Environment) program for HIV-1
protease and its inhibitors. Predicted activity improved when this novel descriptor was added. A system supporting in silico drug screening has been developed at the Biomolecular Modeling and
In Silico Screening Center, http://chemoinfo.science.cmu.ac.th. The system provides tools to evaluate the binding predictive model, quantitative structure-activity relationship (QSAR),
descriptor-based similarity, binding free energy prediction, and the correlation between binding free energy and inhibitory activity, helping users interpret their hypotheses or experimental
results when applying virtual screening to lead compounds.
Features:
Protein
:3D protein structure prediction and evaluation
-3D-MD automated protein structure in water
(implicit and explicit models)
-3D MODELLER structure
:Protein sequence search
Protein-Ligand Interaction
:Protein-inhibitors structure prediction
-Autodock (HIV-1 Protease, SARS, H5N1)
-Autodock + novel descriptor BDT (HIV-1 Protease, SARS, H5N1)
The rough 3D model based on multiple templates was constructed using the academic version 8.1 of
MODELLER. The model with the lowest objective function was chosen as the best proposed model.
This model was then refined to add all missing atoms, especially hydrogen atoms, by energy
minimization and to include the solvent effect by molecular dynamics (MD) simulation with implicit
and explicit solvent models. For the implicit model, a generalized Born (GB) simulation at 300 K for
500 ps with a time step of 2.0 fs was performed using AMBER 8.0 with the force field parameter set 03.
For the explicit solvent model, the model was solvated with water molecules (6 Å from the protein
surface by default) in a cubic cell under periodic boundary conditions. Sodium and chloride ions were
added to neutralize the system. The temperature of the system was then gradually raised to 298 K over
the first 60 ps and kept at this temperature for a total simulation time of 1 ns. The final 3D-MD
protein structure from the 1 ns MD simulation was used as the predicted equilibrium structure. The
overall quality of the refined model was evaluated with PROCHECK for Ramachandran plot quality,
PROSA for interaction energies, and VERIFY3D for the compatibility of each amino acid residue.
For a reliable model, the PROSA score should be below 0 (score < 0) and the VERIFY3D score
should be above 0.1 (score > 0.1).
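The two numerical criteria in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch: the function names md_steps and is_reliable_model are hypothetical and are not part of the 3D-MD system, and the scores are assumed to have been read beforehand from the PROSA and VERIFY3D outputs.

```python
def md_steps(duration_ps, timestep_fs):
    """Number of MD integration steps for a run of the given length."""
    # 1 ps = 1000 fs, so steps = duration in fs divided by the time step.
    return round(duration_ps * 1000.0 / timestep_fs)


def is_reliable_model(prosa_score, verify3d_score):
    """Apply the acceptance thresholds quoted in the text.

    A model passes only if PROSA < 0 and VERIFY3D > 0.1 (both strict).
    """
    return prosa_score < 0.0 and verify3d_score > 0.1
```

For the implicit-solvent run described above, md_steps(500, 2.0) gives 250000 steps. A model sitting exactly on a threshold (for example a VERIFY3D score of 0.1) is rejected, because the text states both conditions as strict inequalities.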
HIV-1 Protease
Ligand
:Molecular similarity search in Thailand natural products database
:Drug-likeness
:Screening for potential drugs
-QSAR properties (JOElib + BDT) of drugs in each family
-Classification of inhibitor type from QSAR
-Statistical screening
-Activity prediction (QSAR)
*under development
Neuraminidase (H5N1)
SARS-CoV
:Protein-Inhibitors structure prediction
Molecular Docking
:In Silico screening for potential drugs
The descriptors of each inhibitor were generated from the novel Binomial Distribution Test
(BDT) analysis in combination with JOElib (a platform-independent, open-source
computational chemistry package written in Java) and compared with those generated
from MOE (Molecular Operating Environment). Native JOELIB values, such as
simple atom/group counts and complexity measures, were used to build primitive
QSAR prediction models for structure-property relations.
Descriptor set     Parameters   PCs
MOE                230          10
JOELIB             47           8
JOELIB and BDT     51           6
Each HIV-1Pr inhibitor structure was obtained by docking the inhibitor to HIV-1Pr. HIV-1Pr was kept
rigid and Gasteiger charges were used. Grid maps for the protease structure were calculated using the
AutoGrid module of the AutoDock 3.0 program. The grid center was assigned to the center of mass of
the enzyme. The number of grid points in x, y, z was 60 × 60 × 60 with a spacing of 0.375 Å. This
parameter set covers the active site extensively and lets the ligand move and explore the enzyme
active site without any constraints from the box size. The inhibitor was positioned in the active site
of HIV-1Pr in many different ways using the Lamarckian genetic algorithm (LGA). The solvation effect
was also included in this docking study. Hydrogen bonding to N and O atoms in the hydrogen grid maps
was calculated using the self-consistent 12-10 hydrogen bonding parameters in AutoGrid.
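To give a feel for the size of the docking grid described above, the following sketch computes the box extents from the grid settings. It assumes that the value 60 counts the intervals between grid points along each axis, so the box edge is 60 × 0.375 Å = 22.5 Å. The function names and the example center are hypothetical; in the study the center is the enzyme's center of mass.

```python
def grid_box(center, npts=(60, 60, 60), spacing=0.375):
    """Return (lower, upper) corners of a grid box centred on `center`.

    Assumes `npts` counts intervals per axis, so each edge is npts * spacing.
    """
    half = [n * spacing / 2.0 for n in npts]
    lower = tuple(c - h for c, h in zip(center, half))
    upper = tuple(c + h for c, h in zip(center, half))
    return lower, upper


def contains(box, point):
    """True if `point` lies inside the box or on its boundary."""
    lower, upper = box
    return all(lo <= p <= hi for lo, p, hi in zip(lower, point, upper))


# Placeholder center at the origin, used only for illustration.
box = grid_box((0.0, 0.0, 0.0))
```

With these settings the box spans 11.25 Å on either side of the center along each axis, which is the region in which ligand atoms can be scored against the grid maps.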
Molecular Docking + Novel descriptor (BDT)
The enzyme-inhibitor complexes from 100 runs of molecular docking were collected in order to explore
all probable binding structures. The residues in the binding pocket within 3 Å of the inhibitor were
selected as vital amino acids for enzyme-inhibitor complex formation. Only the lower-quartile
structures (25%) were considered here. The presence or absence of an amino acid was treated as a
binomial variable x, and the probabilities of presence and absence were assumed to be equal. The
P values P+(x) and P-(x) for the observed numbers of presences (r+) and absences (r-) can be
calculated directly from this binomial distribution.
Correlation between predicted and experimental pKi from MOE, JOELIB, and JOELIB + BDT.
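A correlation of this kind between predicted and experimental pKi can be quantified with the Pearson coefficient. The sketch below is a generic, standard-library implementation offered as an assumption about how such a comparison might be computed; the poster does not state which statistic or software was used, and the function name pearson_r is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of squared deviations and cross deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Calling pearson_r(predicted_pki, experimental_pki) once per descriptor set (MOE, JOELIB, JOELIB + BDT) would allow the three models to be compared on the same scale.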
Classification of inhibitor type from QSAR
All 38 QSAR properties were tested against inhibitor type
in a univariate analysis. Thirty-two properties were found to be
independently associated with inhibitor type. However,
the number of cases (36 cases or structures) is too small
for testing with 32 properties. Therefore, the principal
component technique was used to define new combinations
of properties. The classification of inhibitor type is
shown in the scatter plot of the canonical discriminant
functions.
\[
P^{+}(x) = P(x \ge r^{+} \mid p = 1/2) = \sum_{k=r^{+}}^{N} \binom{N}{k} (0.5)^{k} (0.5)^{N-k}
\]
\[
P^{-}(x) = P(x \ge r^{-} \mid p = 1/2)
\]
If P+(x) < α, amino acid x is taken to be present. On the other hand, if P-(x) < α, amino acid x is
taken to be absent. In this case, α = 0.005, i.e., the presence (or absence) of the residue is
accepted at the 99.5% confidence level. Typically, three binding energy terms used in AutoDock 3.0
were included in the score function: the van der Waals interaction represented as a Lennard-Jones
12-6 dispersion/repulsion term, the hydrogen bonding represented as a directional 12-10 term, and the
Coulombic electrostatic potential. The binding energy of cyclic urea derivatives with HIV-1Pr could
therefore be described simply in terms of the electrostatic, van der Waals, and hydrogen bonding
interaction energies. On the basis of the traditional molecular force field model of interaction
energy, a new score function at the level of binding free energy was derived and adopted in
AutoDock 3.0. The new score function thus ranks the inhibitors into different levels of binding
affinity according to the highest probability of finding key amino acids in the binding pocket.
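The presence/absence decision for a single residue can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, N is taken to be the number of docked structures considered (for example 25 if the lower quartile of 100 runs is used, which is an inference from the text), and a third outcome, "undetermined", is introduced here for residues that pass neither test.

```python
from math import comb


def binomial_tail(n, r, p=0.5):
    """Upper-tail probability P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def classify_residue(n_present, n_total, alpha=0.005):
    """Label a residue from how often it lies near the inhibitor.

    n_present: structures in which the residue is within the cutoff (r+).
    n_total:   structures considered (N); the absences are r- = N - r+.
    """
    n_absent = n_total - n_present
    p_plus = binomial_tail(n_total, n_present)   # P+(x)
    p_minus = binomial_tail(n_total, n_absent)   # P-(x)
    if p_plus < alpha:
        return "present"
    if p_minus < alpha:
        return "absent"
    return "undetermined"  # assumption: neither test is significant
```

With N = 25, a residue found in all 25 structures has P+(x) = 0.5^25, far below α = 0.005, and is labelled present, whereas a residue found in 13 of 25 structures passes neither test.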
Scatter plot of the canonical discriminant functions
Drug-likeness from self-organizing neural network
Data sets:
68 HIV active compounds
54 Thailand natural products
Descriptors: 96
Algorithm: Kohonen
Topology: Toroidal
Network: 5x5
To be added: aligned structures comparing dock / dock + BDT
Global Molecular Descriptors (Molecular weight, No. of H bonding acceptors, No. of H bonding donors, Octanol/water
partition coefficient (logP), Topological polar surface area, Mean molecular polarizability, Molecular dipole moment, Aqueous solubility
(logS))
2D Autocorrelation Descriptors (Identity, σ Charge, π Charge, Total charge, σ Electronegativity, π Electronegativity,
Lone pair electronegativity, Effective polarizability)
Autocorrelation of surface properties (Electrostatic potential, Hydrogen bonding potential, Hydrophobicity potential)
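The toroidal 5x5 Kohonen topology listed above means that the map wraps around at its edges, so nodes on opposite borders are neighbours. The sketch below illustrates that idea only. It is not the software used for the poster, the function names are hypothetical, and node weights are assumed to be stored in a dictionary keyed by (row, column).

```python
import math


def toroidal_distance(a, b, rows=5, cols=5):
    """Grid distance between nodes a and b on a map that wraps at the edges."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # On a torus the shorter way round may cross the border.
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return math.hypot(dr, dc)


def best_matching_unit(weights, sample):
    """Node whose weight vector is closest (squared Euclidean) to the sample."""
    return min(
        weights,
        key=lambda node: sum((w - s) ** 2 for w, s in zip(weights[node], sample)),
    )


def neighbourhood(node, radius, rows=5, cols=5):
    """All nodes within `radius` of `node`, measured on the torus."""
    return [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if toroidal_distance(node, (r, c), rows, cols) <= radius
    ]
```

On this map the corner nodes (0, 0) and (4, 4) are diagonal neighbours rather than being at opposite ends, so no node sits on a boundary and every node has the same number of neighbours during training.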
Acknowledgements
: Prof. Dr. Supot Hannongbua, In Silico Screening Network, The Thailand Research Fund (TRF) Senior Scholar
: Dr. Chak Sangma for Thailand Natural Products Database
: The Thailand Research Fund (DBG48)
References
:Pinmas, P.; Lee, V. S.; Nimmanpipug, P.; Chaijaruwanich, J. “Automated Prediction of Three-Dimensional Protein Structure in Water from Molecular Dynamics Simulation (3D-MD)”, Proceeding of the 9th Annual National
Symposium on Computational Science Engineering (ANSCSE9), Mahidol University, Thailand, March 23-25, 2005, 523-527.
:Jitonnom, J.; Lee, V. S.; Nimmanpipug, P.; Kodchakorn, K.; Hannongbua, S. “Refinement of Neuramidase H5N1 Model by Molecular Dynamics Simulations with Implicit Solvent Models”, Proceeding of the 10th Annual
National Symposium on Computational Science Engineering (ANSCSE10), Chiang Mai University, Thailand, March 22-24, 2006, 113-117.