Bawden, David (1978) Substructural analysis techniques for structure-property correlation within computerised chemical information systems. PhD thesis, University of Sheffield.
The work described in this thesis involves a novel method of substructural analysis, with potential application for structure-property correlation and information retrieval within computerised chemical information systems. A review is given of the development of the concept of chemical structure and its representation, its application in computerised chemical information systems, and methods for correlating structure with molecular properties. A method is presented for the derivation of structural features, representing the whole structure, from Wiswesser Line Notation (WLN) by computer program. These features are then used as variables in statistical analysis procedures: in this work multiple regression analysis and cluster analysis are used. This procedure allows for a rapid, convenient and thorough analysis of large data-sets. The type of structural features used may be easily varied, allowing for investigation of factors such as ring substitution patterns, group interactions, and three-dimensional structure. The method is applicable to sets of diverse or structurally related compounds. Statistical tests of the results enable quantitative testing of hypotheses. Multiple regression analysis allows a direct, quantitative correlation between structure and molecular property, and subsequent property prediction. It is applied to sets of aliphatic, alicyclic, aromatic, and heterocyclic compounds, including sets of highly diverse structures. Properties examined include biological effects, toxicity, pK, thermochemical properties, boiling point, solubility, and partition coefficient. Some of these properties are highly dependent upon electronic and steric effects, and hence upon relative position of substituents, and on three-dimensional structure. Highly significant correlations are obtained in all cases, and the potential for property prediction is demonstrated. Cluster analysis is applied to several sets of structures. Intuitively sensible classifications are obtained, and the potential for both property prediction and information retrieval is discussed. Since these techniques involve the widely used WLN, relatively simple COBOL programs, and standard statistical packages, they should be applicable within operational environments.
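As an illustration of the general idea of regressing a property on substructural features, the following is a minimal sketch and not the thesis's own programs, which were written in COBOL and derived features from WLN. Everything in it is an assumption made for illustration: the fragment names, the fragment counts, and the property values, which are synthetic and generated from assumed additive contributions so that the fit can be checked. No WLN parsing is attempted.

```python
# Hypothetical fragment types; the thesis derives its features from WLN instead.
FRAGMENTS = ["methyl", "hydroxyl", "chloro"]

# Assumed additive contributions: [baseline, methyl, hydroxyl, chloro].
# These are synthetic values used only to generate toy data.
ASSUMED_CONTRIBUTIONS = [1.0, 0.5, -1.0, 0.7]

# Synthetic fragment counts per compound, in FRAGMENTS order (not real data).
TRAINING = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 1, 0), (1, 0, 2), (0, 2, 1)]


def predict(coeffs, counts):
    """Additive model: baseline plus the sum of count * contribution."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], counts))


def fit_least_squares(rows, targets):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Design matrix: a leading 1 for the baseline term, then the fragment counts.
design = [[1, *counts] for counts in TRAINING]
# Synthetic property values generated from the assumed contributions.
targets = [predict(ASSUMED_CONTRIBUTIONS, counts) for counts in TRAINING]

coefficients = fit_least_squares(design, targets)
for name, value in zip(["baseline", *FRAGMENTS], coefficients):
    print(f"{name}: {value:.3f}")
```

Because the toy property values are exactly additive, the fitted coefficients reproduce the assumed contributions; with real measurements the fit would be approximate, and statistical tests of the kind the abstract mentions would be needed to judge its significance.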
|Item Type:||Thesis (PhD)|
|Academic Units:||The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)|
|Depositing User:||EThOS Import Sheffield|
|Date Deposited:||03 Dec 2012 09:51|
|Last Modified:||08 Aug 2013 08:50|