Subtask A: Entity recognition
Given a list of eHealth documents written in Spanish, the goal of this subtask is to identify all the entities per document and their types. These entities are all the relevant terms (single word or multiple words) that represent semantically important elements in a sentence. The following figure shows the relevant entities that appear in a set of example sentences.
Note that some entities ("vías respiratorias" and "60 años") span more than one word. Entities always consist of one or more complete words (i.e., never a prefix or a suffix of a word) and never include surrounding punctuation symbols, parentheses, etc. There are four entity types:
- Concept: identifies a relevant term, concept, or idea in the knowledge domain of the sentence.
- Action: identifies a process or modification of other entities. It can be indicated by a verb or verbal construction, such as "afecta" (affects), but also by nouns, such as "exposición" (exposure), where it denotes the act of being exposed to the Sun, and "daños" (damage), where it denotes the act of damaging the skin. It can also be used to indicate non-verbal functional relations, such as "padre" (parent), etc.
- Predicate: identifies a function or filter of another set of elements, which has a semantic label in the text, such as "mayores" (older), and is applied to an entity, such as "personas" (people), with some additional arguments, such as "60 años" (60 years).
- Reference: identifies a textual element that refers to an entity, either in the same sentence or in a different one, which can be indicated by textual clues such as "esta" (this), "aquel" (that), etc.
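
The span constraints and the four entity types described above can be illustrated with a minimal Python sketch. This is not the task's official data format or tooling: the `Entity` class, the `make_entity` and `is_valid_entity` helpers, the character-offset representation, and the example sentence are all assumptions made for illustration, and the sketch only handles contiguous spans.

```python
from dataclasses import dataclass

# The four entity types named in the task description.
ENTITY_TYPES = {"Concept", "Action", "Predicate", "Reference"}


@dataclass(frozen=True)
class Entity:
    """Hypothetical representation: a contiguous character span plus a type."""
    start: int  # offset of the first character of the span
    end: int    # offset one past the last character of the span
    label: str  # expected to be one of ENTITY_TYPES


def make_entity(text: str, phrase: str, label: str) -> Entity:
    """Hypothetical helper: locate the first occurrence of `phrase` in `text`."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} does not occur in the text")
    return Entity(start, start + len(phrase), label)


def is_valid_entity(text: str, entity: Entity) -> bool:
    """Check the stated constraints: known type, complete words, no edge punctuation."""
    if entity.label not in ENTITY_TYPES:
        return False
    if not (0 <= entity.start < entity.end <= len(text)):
        return False
    span = text[entity.start:entity.end]
    # The span must not begin or end with punctuation or whitespace.
    if not (span[0].isalnum() and span[-1].isalnum()):
        return False
    # Complete words only: the characters just outside the span must not
    # be letters or digits, otherwise the span cuts a word in half.
    before_ok = entity.start == 0 or not text[entity.start - 1].isalnum()
    after_ok = entity.end == len(text) or not text[entity.end].isalnum()
    return before_ok and after_ok


# Illustrative sentence and type assignments (assumed, not taken from the figure).
sentence = "El asma afecta las vías respiratorias."
entities = [
    make_entity(sentence, "asma", "Concept"),
    make_entity(sentence, "afecta", "Action"),
    make_entity(sentence, "vías respiratorias", "Concept"),  # multi-word entity
]

for entity in entities:
    text_span = sentence[entity.start:entity.end]
    print(text_span, entity.label, is_valid_entity(sentence, entity))
```

Under these assumptions, a span such as "respirat" (a word prefix) or "respiratorias." (with trailing punctuation) would be rejected by `is_valid_entity`, while the three entities listed above pass the check.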