Annotating entities in text

The basic annotation unit within the CLEF corpus is the entity. Entities refer to real-world objects that are part of a patient's care and treatment: conditions, drugs, investigations, etc. Entities are grounded in the text of CLEF documents. A span of text that refers to an entity is a mention of that entity. An entity may be mentioned several times in the same document, and different mentions may refer to the same entity: "Mr. Jones's tumour... his melanoma... the lump".
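The relationship between entities, mentions and text spans can be pictured with a small sketch. This is only an illustration and not the CLEF annotation format: the class names `Mention` and `Entity`, the helper `find_mention`, the type label "Condition" and the exact span boundaries chosen are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Mention:
    """A span of document text that refers to an entity (hypothetical structure)."""
    start: int  # character offset of the first character of the span
    end: int    # character offset one past the last character of the span
    text: str


@dataclass
class Entity:
    """A real-world object, grounded in the text by one or more mentions."""
    entity_type: str  # assumed label, e.g. "Condition"
    mentions: List[Mention] = field(default_factory=list)


def find_mention(document: str, phrase: str) -> Mention:
    """Locate the first occurrence of phrase and wrap it as a Mention."""
    start = document.index(phrase)  # raises ValueError if the phrase is absent
    return Mention(start, start + len(phrase), phrase)


# The example text from the paragraph above.
document = "Mr. Jones's tumour... his melanoma... the lump"

# Three different mentions, all grounding the same entity.
condition = Entity(entity_type="Condition")
for phrase in ("tumour", "melanoma", "the lump"):
    condition.mentions.append(find_mention(document, phrase))

for mention in condition.mentions:
    print(mention.start, mention.end, repr(mention.text))
```

Storing character offsets as well as the text keeps each mention tied to one specific place in the document, so the same word appearing twice can be recorded as two separate mentions.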

Just as the entity is the basic unit of annotation, so marking up entities and mentions is the basic sub-task of the annotation process. In this sub-task, stretches of text are marked as mentions of an entity of a particular type. Where several mentions refer to the same entity, a co-reference link may be created between them. This section describes, for each of the entity types, how annotators should map from the surface text to annotation:

Summary

The entities