This document describes use cases for the XML format for annotations
and their display properties in task files (see "Creating a New Task").
For the full format, see the annotation XML reference.
This document focuses exclusively on the definition of
annotations and their display properties. There are two subsystems
that provide this capability. The <annotations> element and
its children provide the new subsystem, and the
<annotation_set_descriptors> element and its children, and
the <annotation_display> element and its children, provide
the legacy subsystem. Other customizations you can make in your
task.xml file are described separately.
If you don't care about the legacy subsystem, feel free to skip
this section.
The primary differences between the new <annotations>
subsystem and the legacy <annotation_set_descriptors>
subsystem are organizational.
First, in the legacy system, all the elements of each annotation
set descriptor are defined within a single
<annotation_set_descriptor> element. Because attributes can
be defined in different sets than the annotations that bear them,
attributes in the legacy system are defined outside the scope of
the annotations that bear them, which introduced additional
verbosity and made it harder to comprehend the annotation
inventory. In the new system, attributes are defined within the
scope of their annotations, and there's no separate container for
annotation set descriptors; the set and category of the
annotations and attributes is recorded using the "of_set" and
"of_category" XML attributes. This makes it harder to comprehend
the set descriptor structure, but complex annotation attribute
structures are much more common than multiple set descriptors, and
so the new system simplifies the former at the expense of the
latter.
Second, in the legacy system, display properties were defined in
an entirely separate section of the task.xml file, which led to
additional verbosity and introduced opportunities for error. This
separate display section has been eliminated in the new system,
and all display-relevant attributes are prefixed with "d_" for
clarity.
Third, in the new system, we've introduced a number of shorthands
to define spanned vs. spanless annotations, different types of
attributes (<int>, <string>, etc.) and set and list
attributes (<int_set>, <int_list>, etc.). We've also
eliminated some extraneous element levels (<range>, for
instance, has been replaced by attributes on its former parent).
Fourth, in the legacy system, the visual menu hierarchy of
annotations was defined in this separate display section. In the
new system, the <display_group> element can be used to
create hierarchies of declarations of annotation types.
Note: if you make use of task inheritance (i.e., if you
use the "parent" attribute on your <task> element in your task.xml file), you may have good
reasons to use the legacy subsystem; it's only in the legacy
subsystem that you can inherit annotation types from the parent
task and add attributes or display behavior to that type.
In the sections below, we'll illustrate both the new and legacy
methods.
The simplest example of customizing your annotations in your
task.xml file is inheriting all your structural annotations and
adding your own content annotations. The role of the different
annotation categories is described here.
New:
<annotations inherit="category:zone,category:token">
<span label="TAG1"/>
<span label="TAG2"/>
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
<annotation label="TAG1"/>
<annotation label="TAG2"/>
</annotation_set_descriptor>
</annotation_set_descriptors>
So here, we've inherited the structure annotations and defined
two content annotations, TAG1 and TAG2. The content annotations
are both spanned annotations, by default.
Not all content annotations are spanned annotations; some
annotations aren't anchored directly to the text. You can find
examples of such annotations in Tutorial 7. It's easy to define
these annotations.
New:
<annotations inherit="category:zone,category:token">
...
<spanless label="SPANLESS1"/>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="SPANLESS1" span="no"/>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
The UI effects of defining spanless annotations are described here.
Annotations, spanned or spanless, can have attributes. These
attributes can be strings (the default), floats, integers,
Booleans, or other annotations, or sets or lists of these types.
Here's how to define a simple string attribute.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TAG1">
...
<string name="string_attr"/>
...
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<attribute of_annotation="TAG1" name="string_attr"/>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
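Since spanless annotations can bear attributes too, the same pattern should carry over to the <spanless> element. The following is a minimal sketch under that assumption: the nesting of <string> inside <spanless> is inferred from the <span> example above rather than confirmed here, and the attribute name "note" is hypothetical.

```xml
<annotations inherit="category:zone,category:token">
  <!-- Assumed: <spanless> accepts the same attribute children as <span>. -->
  <spanless label="SPANLESS1">
    <!-- "note" is a hypothetical attribute name. -->
    <string name="note"/>
  </spanless>
</annotations>
```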
String attributes can have default values, or choices.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TAG1">
...
<string name="string_attr" default="Pronoun"
choices="Pronoun,Nominal,Proper name"/>
...
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<attribute of_annotation="TAG1" name="string_attr" default="Pronoun">
<choice>Pronoun</choice>
<choice>Nominal</choice>
<choice>Proper name</choice>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
Integer and float attributes can be defined with accepted ranges.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TAG1">
...
<int name="int_attr" range_from="10" range_to="20"/>
...
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<attribute of_annotation="TAG1" type="int" name="int_attr">
<range from="10" to="20"/>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
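The integer example should extend to float attributes. This sketch assumes a <float> shorthand element analogous to <int>, which does not appear in the examples above; check the annotation XML reference before relying on it. The attribute name "float_attr" and the range bounds are illustrative.

```xml
<annotations inherit="category:zone,category:token">
  <span label="TAG1">
    <!-- Assumed element name, by analogy with <int>. -->
    <float name="float_attr" range_from="0.0" range_to="1.0"/>
  </span>
</annotations>
```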
Annotation-valued attributes must have label restrictions that specify
which types of annotations can fill the attribute value (more
examples here).
New:
<annotations inherit="category:zone,category:token">
...
<span label="TAG1">
...
<filler name="annot_attr" filler_types="TAG2"/>
...
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<attribute of_annotation="TAG1" type="annotation" name="annot_attr">
<label_restriction label="TAG2"/>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
And any of these attributes can be set or list aggregations.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TAG1">
...
<filler_set name="mentions" filler_types="TAG2"/>
...
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<attribute of_annotation="TAG1" type="annotation" aggregation="set" name="mentions">
<label_restriction label="TAG2"/>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
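The aggregation shorthands are not limited to annotation-valued attributes. Here is a minimal sketch using the <int_set> and <int_list> elements named in the introduction; the attribute names "ratings" and "offsets" are made up for illustration.

```xml
<annotations inherit="category:zone,category:token">
  <span label="TAG1">
    <!-- A set aggregation of integers. -->
    <int_set name="ratings"/>
    <!-- A list aggregation of integers. -->
    <int_list name="offsets"/>
  </span>
</annotations>
```

In the legacy subsystem, the analogous declaration would presumably combine type="int" with aggregation="set", following the pattern of the annotation-valued example above; aggregation="list" is an assumption by analogy.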
In some situations, you may want to define a single content
annotation, which has a distinguished attribute value. One common
example of this in language processing arises in tagging for
so-called named entities (people, locations, organizations). One
common tagging scheme assigns a single ENAMEX tag to these
entities, and distinguishes among them using the value of the
"type" attribute. This label + attribute/value pair is assigned a
notional name, for use in the UI, scorer, etc. We call these effective
labels.
Effective labels must be defined on choice restrictions of string
or integer attributes. If an effective label is declared for one
of the choices, there must be a declaration for all of them. In
other words, the choices must completely partition the label.
New:
<annotations inherit="category:zone,category:token">
...
<span label="ENAMEX">
...
<effective_labels name="type">
<label label="PERSON" value="PER"/>
<label label="ORGANIZATION" value="ORG"/>
<label label="LOCATION" value="LOC"/>
</effective_labels>
...
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="ENAMEX"/>
<attribute of_annotation="ENAMEX" name="type">
<choice effective_label="PERSON">PER</choice>
<choice effective_label="ORGANIZATION">ORG</choice>
<choice effective_label="LOCATION">LOC</choice>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
You can define complex restrictions on annotation-valued
attributes in a number of ways. These restrictions consist of a
label and its attributes; the attributes must be choice attributes
(i.e., string or integer attributes with choices defined). The
availability of these restrictions is independent of whether an
effective label is defined for the attribute.
Here's an example fragment. It starts with the effective label
attribute definition from the previous example, but defines a
second (nonsensical) integer choice attribute.
New:
<annotations inherit="category:zone,category:token">
...
<span label="ENAMEX">
...
<effective_labels name="type">
<label label="PERSON" value="PER"/>
<label label="ORGANIZATION" value="ORG"/>
<label label="LOCATION" value="LOC"/>
</effective_labels>
<int name="size" choices="0,1"/>
...
</span>
<span label="LOCATED">
<filler name="who">
<filler_type label="ENAMEX" type="PER" size="1"/>
</filler>
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="ENAMEX"/>
<attribute of_annotation="ENAMEX" name="type">
<choice effective_label="PERSON">PER</choice>
<choice effective_label="ORGANIZATION">ORG</choice>
<choice effective_label="LOCATION">LOC</choice>
</attribute>
<attribute of_annotation="ENAMEX" type="int" name="size">
<choice>0</choice>
<choice>1</choice>
</attribute>
<!-- and now, the annotation-valued attribute -->
<annotation label="LOCATED"/>
<attribute of_annotation="LOCATED" name="who" type="annotation">
<label_restriction label="ENAMEX">
<attributes type="PER" size="1"/>
</label_restriction>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
The label restriction itself can refer either to a true or an
effective label, and the effective label can be combined with
additional attribute restrictions, as long as those attributes are
choice attributes:
New:
<span label="LOCATED">
<filler name="who">
<filler_type label="PERSON" size="1"/>
</filler>
</span>
Legacy:
<annotation label="LOCATED"/>
<attribute of_annotation="LOCATED" name="who" type="annotation">
<label_restriction label="PERSON">
<attributes size="1"/>
</label_restriction>
</attribute>
As you can see from the previous example, you can use
annotation-valued attributes and label restrictions to create
relations among annotations. This is the only facility that MAT
provides for making these connections. We acknowledge that this
approach has limitations:
So, for instance, how might you represent an array of time
restrictions (before, after, etc.) on an event? Here are three
different strategies.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<span label="EVENT">
<!-- if you're anticipating multiple times for a restriction type,
make these set aggregations -->
<filler name="before" filler_types="TIME"/>
<filler name="after" filler_types="TIME"/>
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<annotation label="EVENT"/>
<!-- if you're anticipating multiple times for a restriction type,
make these set aggregations -->
<attribute of_annotation="EVENT" type="annotation" name="before">
<label_restriction label="TIME"/>
</attribute>
<attribute of_annotation="EVENT" type="annotation" name="after">
<label_restriction label="TIME"/>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
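As the comments in this example suggest, if an event can have several times for one restriction type, the attributes can be declared as set aggregations. A sketch of the new-subsystem version, reusing the <filler_set> shorthand shown earlier:

```xml
<annotations inherit="category:zone,category:token">
  <span label="TIME"/>
  <span label="EVENT">
    <!-- Each restriction is now a set aggregation of TIME annotations. -->
    <filler_set name="before" filler_types="TIME"/>
    <filler_set name="after" filler_types="TIME"/>
  </span>
</annotations>
```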
The obvious problem with this strategy is that you might have
many, many temporal relations you care about, and/or you may want
to provide attributes for the temporal relations.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<span label="EVENT"/>
<!-- so can this annotation -->
<span label="BEFORE">
<filler name="event" filler_types="EVENT"/>
<filler name="time" filler_types="TIME"/>
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<annotation label="EVENT"/>
<!-- so can this annotation -->
<annotation label="BEFORE"/>
<attribute of_annotation="BEFORE" type="annotation" name="event">
<label_restriction label="EVENT"/>
</attribute>
<attribute of_annotation="BEFORE" type="annotation" name="time">
<label_restriction label="TIME"/>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
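Because the relation is an annotation in its own right under this strategy, it can carry attributes of its own. A minimal sketch in the new subsystem, in which the "certainty" attribute and its choices are hypothetical:

```xml
<annotations inherit="category:zone,category:token">
  <span label="TIME"/>
  <span label="EVENT"/>
  <span label="BEFORE">
    <filler name="event" filler_types="EVENT"/>
    <filler name="time" filler_types="TIME"/>
    <!-- Hypothetical attribute describing the relation itself. -->
    <string name="certainty" default="certain" choices="certain,uncertain"/>
  </span>
</annotations>
```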
You could generalize this strategy by having a single temporal
relation with an attribute to indicate what kind of temporal
relation it is.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<span label="EVENT"/>
<!-- so can this annotation -->
<span label="TEMPORAL">
<filler name="event" filler_types="EVENT"/>
<filler name="time" filler_types="TIME"/>
<string name="type" choices="BEFORE,AFTER,..."/>
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<annotation label="EVENT"/>
<!-- so can this annotation -->
<annotation label="TEMPORAL"/>
<attribute of_annotation="TEMPORAL" type="annotation" name="event">
<label_restriction label="EVENT"/>
</attribute>
<attribute of_annotation="TEMPORAL" type="annotation" name="time">
<label_restriction label="TIME"/>
</attribute>
<attribute of_annotation="TEMPORAL" name="type">
<choice>BEFORE</choice>
<choice>AFTER</choice>
...
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
The obvious problem with this strategy is that the temporal
relations are separated from the events they modify (because
there's no way of showing or representing relations as subordinate
attributes).
This strategy is a combination of the first two.
New:
<annotations inherit="category:zone,category:token">
...
<span label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<span label="EVENT">
<filler_set name="temporal" filler_types="TEMPORAL"/>
</span>
<!-- so can this annotation -->
<span label="TEMPORAL">
<filler name="time" filler_types="TIME"/>
<string name="type" choices="BEFORE,AFTER,..."/>
</span>
...
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content" category="content">
...
<annotation label="TIME"/>
<!-- this annotation can be spanned or spanless -->
<annotation label="EVENT"/>
<!-- so can this annotation -->
<annotation label="TEMPORAL"/>
<attribute of_annotation="TEMPORAL" type="annotation" name="time">
<label_restriction label="TIME"/>
</attribute>
<attribute of_annotation="TEMPORAL" name="type">
<choice>BEFORE</choice>
<choice>AFTER</choice>
...
</attribute>
<attribute of_annotation="EVENT" type="annotation" aggregation="set" name="temporal">
<label_restriction label="TEMPORAL"/>
</attribute>
...
</annotation_set_descriptor>
</annotation_set_descriptors>
The distinction here is subtle: instead of TEMPORAL being a
two-place relation between an event and a time, it's got only one
argument, the time, and its relation to the EVENT is represented
by its presence in the "temporal" set-aggregation
annotation-valued attribute.
The obvious disadvantage to this strategy is that it doesn't
correspond trivially to what we'd think of as the "correct event
logic". However, given that we're talking about annotations, not
objects in a knowledge representation, it might ultimately be the
proper compromise.
Every task in MAT has multiple annotation sets
defined: at the very least, the task will define the "admin" set,
which contains the SEGMENT
annotation. And all the examples up to this point inherit zone and
token annotation set categories from the root task. However, all
the examples so far also have defined a single content annotation
set (i.e., a set which is neither admin, zone, nor token).
But there's no reason you have to do this; defining multiple
annotation sets is absolutely trivial. Here's a simple example.
New:
<annotations inherit="category:zone,category:token">
<span label="TAG1" of_set="content1"/>
<span label="TAG2" of_set="content2"/>
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="content1" category="content">
<annotation label="TAG1"/>
</annotation_set_descriptor>
<annotation_set_descriptor name="content2" category="content">
<annotation label="TAG2"/>
</annotation_set_descriptor>
</annotation_set_descriptors>
As you can see, in the new subsystem, it's no more complicated
than setting the of_set attribute (and of_category, if it's other
than "content"); and in the legacy subsystem, it's no more
complicated than placing the annotations and attributes in
different <annotation_set_descriptor> elements. Note that
definition order is important; you can't refer to an annotation
label (say, in a <label_restriction> in the legacy subsystem
or the <filler_types> in the new subsystem) before it's
defined.
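To illustrate the ordering constraint, this sketch defines TAG2 before the attribute on TAG1 that refers to it. The set names are the ones used above; the attribute name "related" is illustrative.

```xml
<annotations inherit="category:zone,category:token">
  <!-- TAG2 is defined first, so it can be named in filler_types below. -->
  <span label="TAG2" of_set="content2"/>
  <span label="TAG1" of_set="content1">
    <filler name="related" filler_types="TAG2"/>
  </span>
</annotations>
```

Swapping the two <span> declarations would make the filler refer to TAG2 before its definition.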
There are many reasons you may not want all your content
annotations in the same set. For instance, you may want to
structure your annotation task so that all your mentions are
annotated before any of your relations are added. You may also
have multiple engines you're trying to use, each of which works on
a different subset of your annotations. In these cases, you might
want multiple annotation steps; and since steps are linked to
annotation types at the granularity of annotation sets, if you
have multiple annotation steps, you'll want multiple
annotation sets.
Note that there are reasons to have multiple annotation sets even
if you don't map them to multiple steps. You may have a single
annotation step, which adds multiple sets, and in this hand
annotation step in the UI, you'll now have multiple active
annotation sets, any of which you can disable or hide as a set,
rather than annotation by annotation.
One useful feature of the way annotation types are defined in MAT
is that attributes can be placed in annotation sets other than the
set of the annotations that bear them. In the legacy subsystem,
attributes are intentionally defined outside the scope of
the annotation in order to emphasize this; in the new subsystem,
it's managed via XML attributes. For instance, perhaps you want
your event tagging workflow to separate finding the event spans
from connecting them with their arguments. You could do this:
New:
<annotations inherit="category:zone,category:token">
<span label="TAG1" of_set="mentions"/>
<span label="TAG2" of_set="mentions"/>
<span label="EVENT1" of_set="events">
<filler name="arg1" of_set="args" filler_types="TAG1"/>
<filler name="arg2" of_set="args" filler_types="TAG2"/>
</span>
</annotations>
Legacy:
<annotation_set_descriptors inherit="category:zone,category:token">
<annotation_set_descriptor name="mentions" category="content">
<annotation label="TAG1"/>
<annotation label="TAG2"/>
</annotation_set_descriptor>
<annotation_set_descriptor name="events" category="content">
<annotation label="EVENT1"/>
</annotation_set_descriptor>
<annotation_set_descriptor name="args" category="content">
<attribute of_annotation="EVENT1" name="arg1" type="annotation">
<label_restriction label="TAG1"/>
</attribute>
<attribute of_annotation="EVENT1" name="arg2" type="annotation">
<label_restriction label="TAG2"/>
</attribute>
</annotation_set_descriptor>
</annotation_set_descriptors>
In the UI, when you enter a step which adds the "events" set,
you'll be able to add, delete, and modify the extent of event
spans, but not edit their attributes, and when you enter a step
which adds the "args" set, you'll be able to edit the attributes
of your event spans, but not delete them, modify their extents, or
add new ones.
The simplest way of customizing the UI behavior of your content
annotations is to assign some CSS to distinguish them in the Web
UI. The most appropriate way to do this is to use background
colors; using font weight or style, or text color, as the sole
distinguishing feature will fail in document alignment, comparison,
and reconciliation for span annotations, and will always fail for
spanless annotations, which don't cover any text.
New:
<annotations>
...
<span label="TAG1" d_css="background-color: blue"/>
<span label="TAG2" d_css="background-color: green"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="TAG1" css="background-color: blue"/>
<label name="TAG2" css="background-color: green"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 and TAG2 are defined as true or effective labels in your task.)
We've assigned TAG1 a blue background color, and TAG2 a green
background color. Since you're using CSS, you can assign colors
using hexadecimal designations as well (or, if you prefer, set a
background image, or other wacky things).
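For example, a hexadecimal color can be used in place of a color name. This sketch shows the new subsystem only; the particular color values are arbitrary.

```xml
<annotations>
  <span label="TAG1" d_css="background-color: #99CCFF"/>
  <span label="TAG2" d_css="background-color: #CCFF66"/>
</annotations>
```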
One caveat: at the moment, annotation spans are styled on a
token-by-token basis. So if, for instance, you want to have a left
bracket at the left end of an annotation, and a right bracket at
the right end, you can't do that quite yet; you'd end up with each
token bracketed.
Sometimes the color you choose is too dark to see the text, in
which case you can use CSS to change the text color.
New:
<annotations>
...
<span label="TAG1" d_css="background-color: red; color: white"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="TAG1" css="background-color: red; color: white"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 is defined as a true or effective label in your task.)
Remember, the value of the css/d_css attribute is really CSS;
it's not converted or processed in any way before it's inserted
into the CSS rules in the Web UI. The one caveat is that the CSS
is applied to each token in the annotated phrase, not to the
phrase as a whole.
If you're marking longer spans and shorter spans (say, paragraphs
or sentences on the one hand, and entity mentions on the other),
you might want to ensure that the entity mentions always appear
"in front" of the longer spans when annotations aren't stacked. By default, the order in which annotation display elements are defined in your task.xml file is the back-to-front overlap order of the annotations in the UI:
New:
<annotations>
...
<span label="PARAGRAPH" d_css="background-color: blue"/>
<span label="PERSON" d_css="background-color: red"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="PARAGRAPH" css="background-color: blue"/>
<label name="PERSON" css="background-color: red"/>
...
</annotation_display>
(In the legacy case, we assume that PARAGRAPH and PERSON are defined as true or effective labels in your task.)
However, you may be relying on the order of definition of the annotations to be the order that the annotations appear in the legend and the annotation popup menu, and you may not want those orders to be the same. There are two ways to control this order.
First, as of MAT 3.1, you can use the new overlap_rank attribute to control the back-to-front order explicitly. All labels which lack the overlap_rank display attribute will be ordered back-to-front in the order they're defined, and in front of them, all labels which have an overlap_rank attribute will be ordered back-to-front in the order of their overlap_rank attribute. So in this case below, the overlap rank will be, back-to-front, "SECTION", "PARAGRAPH", "LOCATION", "PERSON":
New:
<annotations>
...
<span label="PERSON" d_css="background-color: red" d_overlap_rank="2"/>
<span label="SECTION" d_css="background-color: pink"/>
<span label="PARAGRAPH" d_css="background-color: blue"/>
<span label="LOCATION" d_css="background-color: green" d_overlap_rank="1"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="PERSON" css="background-color: red" overlap_rank="2"/>
<label name="SECTION" css="background-color: pink"/>
<label name="PARAGRAPH" css="background-color: blue"/>
<label name="LOCATION" css="background-color: green" overlap_rank="1"/>
...
</annotation_display>
Second, as of MAT 3.2, you can use the "background_span" value of the new rendering_style attribute to force some labels to (a) never be stacked, and (b) appear behind all spans which do not have this attribute value. E.g.:
New:
<annotations>
...
<span label="PERSON" d_css="background-color: red"/>
<span label="SECTION" d_css="background-color: pink" d_rendering_style="background_span"/>
<span label="PARAGRAPH" d_css="background-color: blue" d_rendering_style="background_span"/>
<span label="LOCATION" d_css="background-color: green"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="PERSON" css="background-color: red"/>
<label name="SECTION" css="background-color: pink" rendering_style="background_span"/>
<label name="PARAGRAPH" css="background-color: blue" rendering_style="background_span"/>
<label name="LOCATION" css="background-color: green"/>
...
</annotation_display>
In this example, the PERSON label will occur behind the LOCATION label, the SECTION label will occur behind the PARAGRAPH label, and both the SECTION and PARAGRAPH labels will occur behind the PERSON and LOCATION labels. In addition, the SECTION and PARAGRAPH labels will never be stacked, no matter how the stacking in the UI is configured.
These two back-to-front ordering control options interact with each other properly; so you can use overlap_rank to explicitly control the order within the background spans, and within the other spans.
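Here is a sketch of the two options combined, assuming that d_overlap_rank and d_rendering_style can appear together on the same element, as the paragraph above implies. The rank values are illustrative.

```xml
<annotations>
  <!-- Background spans: ordered among themselves by overlap rank. -->
  <span label="SECTION" d_css="background-color: pink"
        d_rendering_style="background_span" d_overlap_rank="1"/>
  <span label="PARAGRAPH" d_css="background-color: blue"
        d_rendering_style="background_span" d_overlap_rank="2"/>
  <!-- Other spans: ordered among themselves, in front of the background spans. -->
  <span label="LOCATION" d_css="background-color: green" d_overlap_rank="1"/>
  <span label="PERSON" d_css="background-color: red" d_overlap_rank="2"/>
</annotations>
```

Under the rules described above, the back-to-front order should be SECTION, PARAGRAPH, LOCATION, PERSON.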
By default, the order in which labels are defined does not control the order in which they appear in the legend and the annotation popup menu. Instead, the default behavior is for the labels to appear in alphabetical order. The motivation for this default is that as your label set grows, it becomes harder and harder to easily find the label you're interested in, so imposing a default order that the task definer doesn't have to think about is useful. However, this may not be useful behavior for some tasks; either there may be an intuitive non-alphabetical order, or the task may define cascades for its annotation popup menu. In this case, you can force the UI to respect the definition order in the task, as follows:
<task name="My task">
...
<web_customization alphabetize_labels="no"/>
...
</task>
Note that if you do this, you may need to pay additional attention to the back-to-front order of spanned labels.
The display element also supports the option of having keyboard
accelerators. These are keys that the user can press when the
tagging menu is visible in the UI, which are equivalent to having
selected that menu item. You can add an accelerator using an
attribute on the element.
New:
<annotations>
...
<span label="TAG1" d_accelerator="1" d_css="background-color: blue"/>
<span label="TAG2" d_accelerator="2" d_css="background-color: green"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="TAG1" accelerator="1" css="background-color: blue"/>
<label name="TAG2" accelerator="2" css="background-color: green"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 and TAG2 are defined as true or effective labels in your task.)
Spanless annotations will pop up an annotation editor immediately
when they're created. If you want this behavior for spanned
annotations, you can specify it.
New:
<annotations>
...
<span label="TAG1" d_css="background-color: red" d_edit_immediately="yes"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="TAG1" css="background-color: red" edit_immediately="yes"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 is defined as a true or effective label in your task.)
When an annotation is described in the UI (e.g., as an annotation
attribute value in the annotation tables or annotation editors, or
in the title bar of an annotation editor), it has a default
presentation, which, for spanned annotations, is the string the
span covers, and for spanless annotations, is the label and
annotation index. There are many times when you might want a
different default presented name, e.g., if the label tends to
traverse enormous spans. Here's an example of using the
presented_name attribute to truncate the text.
New:
<annotations>
...
<span label="TAG1" d_css="background-color: red" d_presented_name="$(_text:truncate=20)"/>
...
</annotations>
Legacy:
<annotation_display>
...
<label name="TAG1" css="background-color: red" presented_name="$(_text:truncate=20)"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 is defined as a true or effective label in your task.)
There's a fairly extensive formatting language which is available
for the presented_name attribute; see the annotation XML reference for
details.
Sometimes string attribute values are expected to be short, and
sometimes long. By default, non-choice string attribute values get
a short typein window in the annotation editor. If you want a
long string editor instead, you can request one.
New:
<annotations>
...
<span label="TAG1">
<string name="attr1" d_editor_style="long_string"/>
</span>
...
</annotations>
Legacy:
<annotation_display>
...
<attribute name="attr1" of_annotation="TAG1" editor_style="long_string"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 is defined as a true or effective label in your task, with attr1 as one of its attributes.)
You may have a string attribute which is actually a date, which
you want to use a calendar widget to populate; or you might want
to look up the annotation text in a database, and use the results
to populate the attribute value. If you're willing to do some
programming, there's a way to associate an arbitrary JavaScript
function with a string, int, or float attribute, to use as its
editor.
New:
<web_customization>
<js>js/yourTaskCustomizations.js</js>
</web_customization>
<annotations>
...
<span label="TAG1">
<string name="attr1" d_custom_editor="yourJavascriptFunction"/>
</span>
...
</annotations>
Legacy:
<web_customization>
<js>js/yourTaskCustomizations.js</js>
</web_customization>
<annotation_display>
...
<attribute name="attr1" of_annotation="TAG1" custom_editor="yourJavascriptFunction"/>
...
</annotation_display>
(In the legacy case, we assume that TAG1 is defined as a true or effective label in your task, with attr1 as one of its attributes.)
You can define your function in your task directory in
js/yourTaskCustomizations.js (or whatever name you choose), and
this file will be loaded when the UI is loaded. Unfortunately, we
don't really have the resources to document the API this function
has to conform to; you can find the code which governs this
capability in the CustomEditorCellDisplay implementation in
MAT_PKG_HOME/web/htdocs/js/mat_doc_display.js.
Let's say that you've defined the following annotations.
New:
<annotations>
<span label="PERSON"/>
<span label="MAN"/>
<span label="WOMAN"/>
<span label="US-LOCATION"/>
<span label="FOREIGN-LOCATION"/>
</annotations>
Legacy:
<annotation_set_descriptors>
<annotation_set_descriptor name="content" category="content">
<annotation label="PERSON"/>
<annotation label="MAN"/>
<annotation label="WOMAN"/>
<annotation label="US-LOCATION"/>
<annotation label="FOREIGN-LOCATION"/>
</annotation_set_descriptor>
</annotation_set_descriptors>
Your annotator is instructed to label people, using PERSON as the
annotation if she can't tell which of MAN or WOMAN is applicable.
Your preference is to arrange these in a visual hierarchy for the
annotator's convenience; you wish to do the same with US-LOCATION
and FOREIGN-LOCATION, even though they don't have a common, less
specific annotation.
New:
<annotations>
<display_group>
<span label="PERSON" d_css="background-color: blue"/>
<span label="MAN" d_css="background-color: pink"/>
<span label="WOMAN" d_css="background-color: green"/>
</display_group>
<display_group name="LOCATION">
<span label="US-LOCATION" d_css="background-color: gray"/>
<span label="FOREIGN-LOCATION" d_css="background-color: red"/>
</display_group>
</annotations>
Legacy:
<annotation_set_descriptors>
<annotation_set_descriptor name="content" category="content">
<annotation label="PERSON"/>
<annotation label="MAN"/>
<annotation label="WOMAN"/>
<annotation label="US-LOCATION"/>
<annotation label="FOREIGN-LOCATION"/>
</annotation_set_descriptor>
</annotation_set_descriptors>
<annotation_display>
...
<label name="PERSON" css="background-color: blue"/>
<label name="MAN" css="background-color: pink"/>
<label name="WOMAN" css="background-color: green"/>
<label name="US-LOCATION" css="background-color: gray"/>
<label name="FOREIGN-LOCATION" css="background-color: red"/>
...
<label_group name="PERSON" children="MAN,WOMAN"/>
<label_group name="LOCATION" children="US-LOCATION,FOREIGN-LOCATION"/>
...
</annotation_display>
In the legacy case, the label group can reference an existing
annotation (as in the PERSON case) or create its own group (as in
the LOCATION case). In the new case, the display group either has
its own name, in which case all the annotations it contains are
children of the group, or it has no name, in which case the first
annotation in the group is treated as the parent. The effect of
these groups is to create submenus in the annotation popup in the
MAT UI.
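The grouping rule for the new subsystem can be modeled in a few lines of Python. This is only an illustrative sketch, not MAT's own code: the function name cascade_outline is hypothetical, and it assumes that a display group without a name attribute promotes its first annotation to parent, as in the PERSON example above.

```python
import xml.etree.ElementTree as ET


def cascade_outline(annotations_xml):
    """Return (parent, children) pairs, one per top-level <display_group>.

    Hypothetical helper: it models the rule described in the text and
    does not call into MAT.
    """
    root = ET.fromstring(annotations_xml)
    outline = []
    for group in root.findall("display_group"):
        labels = [span.get("label") for span in group]
        if not labels:
            continue  # nothing to arrange in an empty group
        name = group.get("name")
        if name is not None:
            # Named group: the name is the parent, every annotation is a child.
            outline.append((name, labels))
        else:
            # Unnamed group (assumption): the first annotation is the parent.
            outline.append((labels[0], labels[1:]))
    return outline


SAMPLE = """
<annotations>
  <display_group>
    <span label="PERSON" d_css="background-color: blue"/>
    <span label="MAN" d_css="background-color: pink"/>
    <span label="WOMAN" d_css="background-color: green"/>
  </display_group>
  <display_group name="LOCATION">
    <span label="US-LOCATION" d_css="background-color: gray"/>
    <span label="FOREIGN-LOCATION" d_css="background-color: red"/>
  </display_group>
</annotations>
"""

for parent, children in cascade_outline(SAMPLE):
    print(parent, "->", ",".join(children))
```

For the sample, the two pairs printed match the parent and children named in the two legacy label_group declarations above.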
Note that in all cases, the actual labels which appear in this construction must be styled. The whole point of this construction is to create cascades for
annotation menus, and if the label isn't styled, it doesn't appear
in the annotation menu at all. As a result, it's an error for a referenced label not to be styled.
Also note that if you set up annotation menu cascades, you might want to ensure that your labels are not alphabetized.
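Because an unstyled label in a cascade is an error, it can be worth checking a draft of the new-style <annotations> block before loading it. The following Python sketch is one way to do that; the function name unstyled_group_labels is hypothetical, and the check assumes that "styled" means the element carries a d_css attribute, which may be narrower than what MAT actually accepts.

```python
import xml.etree.ElementTree as ET


def unstyled_group_labels(annotations_xml):
    """List labels that sit inside a <display_group> but have no d_css.

    Hypothetical helper; "styled" is approximated as "has d_css".
    """
    root = ET.fromstring(annotations_xml)
    missing = []
    for group in root.iter("display_group"):
        for child in group:
            if child.get("d_css") is None:
                missing.append(child.get("label"))
    return missing


SAMPLE = """
<annotations>
  <display_group>
    <span label="PERSON" d_css="background-color: blue"/>
    <span label="MAN" d_css="background-color: pink"/>
    <span label="WOMAN"/>
  </display_group>
  <display_group name="LOCATION">
    <span label="US-LOCATION" d_css="background-color: gray"/>
    <span label="FOREIGN-LOCATION" d_css="background-color: red"/>
  </display_group>
</annotations>
"""

print(unstyled_group_labels(SAMPLE))  # prints ['WOMAN']
```

An empty result means every label in every display group carries a d_css attribute.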
If you have effective labels, the contrast between the new and legacy cases is a bit more extensive. Let's say you want to make a cascade out of a set of effective labels:
New:
<annotations>
<span label="ENAMEX">
<effective_labels name="type">
<label label="PERSON" value="PER" d_css="background-color: blue"/>
<label label="ORGANIZATION" value="ORG" d_css="background-color: pink"/>
<label label="LOCATION" value="LOC" d_css="background-color: green"/>
<display_group/>
</effective_labels>
</span>
</annotations>
Legacy:
<annotation_set_descriptors>
<annotation_set_descriptor name="content" category="content">
<annotation label="ENAMEX"/>
<attribute of_annotation="ENAMEX" name="type">
<choice effective_label="PERSON">PER</choice>
<choice effective_label="ORGANIZATION">ORG</choice>
<choice effective_label="LOCATION">LOC</choice>
</attribute>
</annotation_set_descriptor>
</annotation_set_descriptors>
<annotation_display>
<label name="PERSON" css="background-color: blue"/>
<label name="ORGANIZATION" css="background-color: pink"/>
<label name="LOCATION" css="background-color: green"/>
<label_group name="ENAMEX" children="PERSON,ORGANIZATION,LOCATION"/>
</annotation_display>
In the new case, the <display_group> element within an <effective_labels> element declares that the effective labels form a cascade beneath the parent span label. The <display_group> element in this case can also be named and styled, as it can in the main case, but it can have no children. This means that in the legacy case an effective label can be the parent of a cascade subtree, but in the new case effective labels can appear only at the leaves of the cascade.