Sortal Anaphora Dataset

To develop and evaluate a sortal anaphora resolution module, we annotated a corpus of 320 MEDLINE citations with pairwise sortal anaphora relations that link anaphoric expressions to their corresponding antecedents. Because we aimed for a general approach that covers all semantic types and therefore supports SemRep, we collected MEDLINE abstracts on a wide range of topics, including molecular biology and clinical medicine.

For further details, please refer to our BMC Bioinformatics paper, "Sortal anaphora resolution to enhance relation extraction from biomedical literature."

To access these files, users must have accepted the terms of the UMLS Metathesaurus License Agreement, which requires licensees to respect the copyrights of the constituent vocabularies and to file a brief annual report on their use of the UMLS. They must also have activated a UMLS Terminology Services (UTS) account. For information on the use of UTS authentication, please click here.

For details on the licenses, please see the UMLS Metathesaurus License Agreement and How to License and Access the Unified Medical Language System (UMLS) Data.

Sortal Anaphora dataset:

Sortal Anaphora Dataset