
Development and validation of a 3D similarity method for virtual screening

Butler, Daniel (2013) Development and validation of a 3D similarity method for virtual screening. MPhil thesis, University of Sheffield.

Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.



A predictive 3D similarity workflow has been developed as a set of modular Java programs. The programs implement algorithms that capture the key components of a 3D similarity search and incorporate methods addressing both the similar property principle and the molecular recognition paradigm. The workflow takes as input a single query molecule conformation (at least one conformer is required per molecule) and identifies molecules similar to it in a target database of 3D conformations. It first maps the geometric coordinates of each molecular conformation, together with atomic property data, to abstract representative models referred to as fuzzy pharmacophore objects. A geometric partitioning approach maps the full atomic coordinates to a reduced point representation, capturing the overall global shape of the molecule in relatively few points. This “reduced points” approach to molecular representation was first suggested by Glick et al. (2002) in the context of protein active site identification. Pharmacophore classifications are applied to the molecular fragments by mapping their constituent atoms and atomic properties, in order to assign the amount of each potential interaction type present. The classifications are Hydrophobic, Aromatic, Acceptor, Donor and Hydrophilic, and each atom can be mapped to several of these type definitions. Each fragment is thus assigned a biologically relevant code. The resulting fuzzy pharmacophore objects provide a summary-level description of a whole molecule in a relatively small number of geometric points. Two such objects are then aligned to minimise the RMSD between points, and the volume and property overlap is evaluated to derive a global 3D similarity score for each alignment.
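The reduction of a conformation to a few feature-weighted points can be illustrated with a minimal sketch. It is not the thesis implementation, which is written in Java and whose partitioning scheme is not specified in this abstract. The sketch assumes a simple k-means style partitioning with farthest-point seeding; the names `Atom`, `ReducedPoint` and `reduce_molecule` are hypothetical.

```python
from dataclasses import dataclass
from math import dist

FEATURES = ("Hydrophobic", "Aromatic", "Acceptor", "Donor", "Hydrophilic")

@dataclass
class Atom:
    xyz: tuple       # (x, y, z) coordinates
    features: set    # subset of FEATURES; an atom may carry several types

@dataclass
class ReducedPoint:
    centre: tuple    # centroid of the atoms assigned to this point
    weights: dict    # feature name -> fraction of member atoms carrying it

def centroid(atoms):
    n = len(atoms)
    return tuple(sum(a.xyz[i] for a in atoms) / n for i in range(3))

def reduce_molecule(atoms, k, iterations=20):
    """Partition atoms into at most k groups (assumed k-means stand-in)
    and summarise each group as one feature-weighted point.
    Requires 1 <= k <= len(atoms)."""
    # Farthest-point seeding keeps the sketch deterministic.
    centres = [atoms[0].xyz]
    while len(centres) < k:
        far = max(atoms, key=lambda a: min(dist(a.xyz, c) for c in centres))
        centres.append(far.xyz)
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for a in atoms:
            nearest = min(range(len(centres)),
                          key=lambda i: dist(a.xyz, centres[i]))
            groups[nearest].append(a)
        # Move each centre to the centroid of its group (keep it if empty).
        centres = [centroid(g) if g else c for g, c in zip(groups, centres)]
    points = []
    for g, c in zip(groups, centres):
        if g:
            weights = {f: sum(f in a.features for a in g) / len(g)
                       for f in FEATURES}
            points.append(ReducedPoint(c, weights))
    return points
```

Storing fractional weights instead of a single label per point is one way to express the "fuzzy" character of the objects: a point built from a mixed fragment carries partial amounts of several interaction types.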
Two alignment methods are used. The first aligns representations systematically and is in essence a triangle and tetrahedron matching search technique. The second is based on graph theory: parameterised maximal common substructure (clique) detection is applied to a correspondence graph constructed from two representations, and the cliques found by the Bron-Kerbosch algorithm are then aligned to minimal RMSD with the Kabsch rotation algorithm. This provides an alternative, more efficient approach, since the systematic approach is limited to aligning at most four points. A volume and property overlap scoring function compares two fuzzy pharmacophore objects, and the resultant Tanimoto coefficient is used for ranking. Initially, representations of similar size and with equal numbers of points (typically three to six) are compared; these are considered shape searches. Subsequently, objects of different scales and representations are compared as sub-shape searches, in which a smaller object can be searched for within a larger one. The graph-theoretical approach supports both shape and sub-shape search automatically, by including either the entire representation or just the cliques in scoring. In principle there are many ways to overlay two molecules and the sub-shapes or fragments contained within each. Each alignment can score differently, and particular orientations will maximise or minimise particular aspects of the scoring criteria. Hence several key alignments are feasible between two conformations, each of which may define some or all of the part of each molecule that is biologically active in a given context. Each alignment and its associated maximal volume and property overlap score are used to rank the molecules by normalised similarity. When the method is applied to a target database, the evaluated similarity measures order the list by proposed biological activity.
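The graph-based matching step can be sketched as follows. This is an illustrative outline under stated assumptions, not the thesis code: nodes of the correspondence graph pair one point from each object, an edge joins two pairings whose inter-point distances agree within a hypothetical `tolerance` parameter, and pharmacophore type compatibility is ignored for brevity. The Kabsch superposition of the matched points is omitted.

```python
from itertools import combinations
from math import dist

def correspondence_graph(points_a, points_b, tolerance=1.0):
    """Build the correspondence graph of two point sets.

    Nodes are index pairs (i, j); two nodes are adjacent when the
    distance i..k in A agrees with the distance j..l in B.
    """
    nodes = [(i, j) for i in range(len(points_a))
                    for j in range(len(points_b))]
    adjacency = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each point may be matched only once
        d_a = dist(points_a[i], points_a[k])
        d_b = dist(points_b[j], points_b[l])
        if abs(d_a - d_b) <= tolerance:
            adjacency[(i, j)].add((k, l))
            adjacency[(k, l)].add((i, j))
    return adjacency

def bron_kerbosch(adjacency, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    if p is None:
        p = frozenset(adjacency)
    if not p and not x:
        yield r
        return
    # Pivot on the vertex with most neighbours in p to prune branches.
    pivot = max(p | x, key=lambda v: len(adjacency[v] & p))
    for v in p - adjacency[pivot]:
        yield from bron_kerbosch(adjacency, r | {v},
                                 p & adjacency[v], x & adjacency[v])
        p = p - {v}
        x = x | {v}
```

Each clique is a set of mutually consistent point pairings. The largest cliques are natural candidates for a minimal-RMSD superposition, and scoring only the matched points, rather than the whole representation, corresponds to the sub-shape case described above.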
The overall workflow is thus described as a hybrid shape/properties comparison and fragment-based bioisosteric similarity search. Overlap scores are determined for the volume distribution (and by implication shape) and for the mass-derived pharmacophore feature density, so the method aims to capture both shape and pharmacophore search.
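A combined shape and property overlap score of the kind described above can be turned into a Tanimoto coefficient as in the sketch below. The functional form is an assumption made for illustration: each point is treated as a spherical Gaussian with a hypothetical width parameter `alpha`, and the property term is the product of matching feature weights. The abstract does not state the scoring function actually used in the thesis.

```python
from math import dist, exp

def overlap(points_a, points_b, alpha=0.5):
    """Sum of pairwise Gaussian overlaps between two aligned objects.

    Each point is a (centre, weights) pair, where weights maps a
    pharmacophore feature name to its amount at that point.
    """
    total = 0.0
    for centre_a, weights_a in points_a:
        for centre_b, weights_b in points_b:
            shape = exp(-alpha * dist(centre_a, centre_b) ** 2)
            shared = sum(w * weights_b.get(f, 0.0)
                         for f, w in weights_a.items())
            # The constant 1.0 is the pure shape term; 'shared' adds
            # the contribution of matching pharmacophore features.
            total += shape * (1.0 + shared)
    return total

def tanimoto(points_a, points_b, alpha=0.5):
    """Tanimoto coefficient O_AB / (O_AA + O_BB - O_AB)."""
    o_ab = overlap(points_a, points_b, alpha)
    o_aa = overlap(points_a, points_a, alpha)
    o_bb = overlap(points_b, points_b, alpha)
    return o_ab / (o_aa + o_bb - o_ab)
```

Normalising by the self-overlaps gives a score of 1 for identical, perfectly superposed objects and a score near 0 for objects that do not overlap, which makes scores comparable across molecules of different sizes when ranking a database.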

Item Type: Thesis (MPhil)
Keywords: 3D Similarity, Shape, Pharmacophore, Acceptor, Donor, Alignment, volume, properties
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User: Mr Daniel Butler
Date Deposited: 14 May 2013 14:48
Last Modified: 08 Aug 2013 08:53
URI: http://etheses.whiterose.ac.uk/id/eprint/3942
