
The computer storage, retrieval and searching of generic structures in chemical patents : the machine-readable representation of generic structures.

Barnard, John Mordaunt (1983) The computer storage, retrieval and searching of generic structures in chemical patents : the machine-readable representation of generic structures. PhD thesis, University of Sheffield.



Abstract

The nature of the generic chemical structures found in patents is described, with a discussion of the types of statement commonly found in them. The available representations for such structures are reviewed, with particular note being given to the suitability of the representation for searching files of such structures. Requirements for the unambiguous representation of generic structures in an "ideal" storage and retrieval system are discussed. The basic principles of the theory of formal languages are reviewed, with particular consideration being given to parsing methods for context-free languages. The grammar and parsing of computer programming languages, as an example of artificial formal languages, is discussed. Applications of formal language theory to chemistry and information work are briefly reviewed. GENSAL, a formal language for the unambiguous description of generic structures from patents, is presented. It is designed to be intelligible to a chemist or patent agent, yet sufficiently formalised to be amenable to computer analysis. Detailed description is given of the facilities it provides for generic structure representation, and there is discussion of its limitations and the principles behind its design. A connection-table-based internal representation for generic structures, called an ECTR (Extended Connection Table Representation), is presented. It is designed to represent generic structures unambiguously, and to be generated automatically from structures encoded in GENSAL. It is compared to other proposed representations, and its implementation using data types of the programming language Pascal described. An interpreter program which generates an ECTR from structures encoded in a subset of the GENSAL language is presented. The principles of its operation are described.
Possible applications of GENSAL outside the area of patent documentation are discussed, and suggestions made for further work on the development of a generic structure storage and retrieval system based on GENSAL and ECTRs.

Item Type: Thesis (PhD)
Keywords: Information storage/retrieval
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Identification Number/EthosID: uk.bl.ethos.256636
Depositing User: EThOS Import Sheffield
Date Deposited: 16 Jan 2017 16:11
Last Modified: 16 Jan 2017 16:11
URI: http://etheses.whiterose.ac.uk/id/eprint/15079
