Document reconciliation

When you compare or align documents, you merely view a variety of annotation sets associated with a common signal. You can also reconcile the conflicts among these sets, and produce a new document reflecting your decisions.

Note:  MAT 2.0 supported only span annotations in reconciliation.  MAT 3.0 features a new, more general reconciliation approach, tightly coupled with our approaches to comparison and scoring.  This replaces the old reconciliation strategy.  The crucial difference between the two is that reconciliation is now based upon annotation pairings as determined by the comparison algorithm, rather than annotation segments.  

The reconciliation process

To understand the reconciliation UI, it helps to first understand the reconciliation process itself.

The basis of the reconciliation process is the reconciliation document. A reconciliation document is a rich annotated document which is created from a set of input annotated documents. It contains all the annotations from the input documents, along with information about how those annotations have been paired by the comparison algorithm. As the file is reconciled, it also stores which annotation from each pairing (or none of them) the reviewer has selected as the correct annotation.  A pairing will typically contain a candidate annotation from each copy of the document, though it may contain fewer if some of the documents have no annotation that corresponds to the others in the pairing.
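As a way of picturing what a reconciliation document holds, here is a minimal Python sketch. The class names, fields and file names (Candidate, Pairing, ReconciliationDoc, "a.json" and so on) are assumptions made for illustration; they are not MAT's internal classes or its file format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One annotator's proposed annotation (hypothetical shape, not MAT's)."""
    source_doc: str   # which input document the annotation came from
    label: str        # annotation type, e.g. "ORGANIZATION"
    start: int        # character offsets into the shared signal
    end: int


@dataclass
class Pairing:
    """Candidates that the comparison step judged to correspond to each other."""
    candidates: list[Candidate]
    # None = not yet reconciled; otherwise the index of the chosen candidate,
    # or -1 when the reviewer decided that no annotation belongs here.
    chosen: Optional[int] = None

    @property
    def reconciled(self) -> bool:
        return self.chosen is not None


@dataclass
class ReconciliationDoc:
    signal: str
    pairings: list[Pairing] = field(default_factory=list)

    def unreconciled(self) -> list[Pairing]:
        return [p for p in self.pairings if not p.reconciled]


signal = "WHO said that WHO staff were deployed."
doc = ReconciliationDoc(signal, [
    # Two of three input documents contributed a candidate to this pairing;
    # the third had no corresponding annotation.
    Pairing([Candidate("a.json", "ORGANIZATION", 0, 3),
             Candidate("b.json", "PERSON", 0, 3)]),
])
pending_before = len(doc.unreconciled())
doc.pairings[0].chosen = 0   # the reviewer picks the first candidate
pending_after = len(doc.unreconciled())
```

The point of the sketch is only that a pairing groups candidates and records at most one decision; the real document carries more information than this.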

The process of reconciliation is, essentially, a process of voting. In the UI, the reviewer is presented with a table of pairings and makes her choice.  The reviewer may also indicate that none of the annotations belongs in the document, may modify one of the options if none of the options is exactly correct, or may split the pairing if more than one of the alternatives is a correct annotation. 

If you are reconciling a document which contains both spanned and spanless annotations, or annotations which have annotation-valued attributes, the reconciliation will proceed according to the stratification requirements of the comparison algorithm: all annotations which are "pointed to" will be reconciled before the annotations that "point to" them.  Spanned and spanless annotations will always be reconciled in separate "steps", but if you defined an explicit stratification via the definition of similarity profiles, reconciliation will follow that stratification.  (If your task has multiple similarity profiles, you will have the opportunity to specify which one to use when you first generate the reconciliation document.) 
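The "pointed to before pointing" requirement is a dependency ordering. The following sketch shows one way such an ordering could be computed with Python's standard graphlib module. The annotation type names and the strata function are hypothetical, and the sketch covers only the pointing constraint; it does not model the separate handling of spanned and spanless annotations or explicit similarity profiles, and it is not the algorithm MAT itself uses.

```python
from graphlib import TopologicalSorter


def strata(points_to: dict[str, set[str]]) -> list[list[str]]:
    """Group annotation types into layers so that every type appears in a
    later layer than all the types its attributes point to."""
    sorter = TopologicalSorter(points_to)   # values are the prerequisites
    sorter.prepare()
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose targets are done
        layers.append(ready)
        sorter.done(*ready)
    return layers


# Hypothetical task: EMPLOYMENT has attributes whose values are PERSON and
# ORGANIZATION annotations.
points_to = {
    "EMPLOYMENT": {"PERSON", "ORGANIZATION"},
    "PERSON": set(),
    "ORGANIZATION": set(),
}
layers = strata(points_to)
print(layers)  # [['ORGANIZATION', 'PERSON'], ['EMPLOYMENT']]
```

In this example PERSON and ORGANIZATION would be reconciled first, and EMPLOYMENT only afterwards.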

The final step of reconciliation is export. If a reconciliation document has been completely reconciled, you can export it, via the UI, to a new window or document which is a normal, rich annotated document, which reflects all the decisions which have been made during the reconciliation process.

You can also save reconciliation documents at any point in the process and reload them later.

Selecting documents for reconciliation

In the MAT desktop, select "File" in the top menubar, and then "Reconcile files...". You'll be presented with a dialog:

Reconciliation Dialog

This dialog works much like the document comparison dialog: you must select a task, language and similarity profile, and then either select loaded documents or load documents via the "Load document..." option. These documents will be listed in the dialog, and once there are at least two, the "Reconcile" button will be enabled. As with document comparison, the signals of all the documents must match.
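The two preconditions just described (at least two documents, and identical signals) can be stated compactly. This is an illustrative sketch only; can_reconcile is a hypothetical helper, not a function in MAT.

```python
def can_reconcile(signals: list[str]) -> bool:
    """True when at least two documents are listed and all signals match."""
    return len(signals) >= 2 and all(s == signals[0] for s in signals)


ok = can_reconcile(["WHO met today.", "WHO met today."])
too_few = can_reconcile(["WHO met today."])
mismatch = can_reconcile(["WHO met today.", "WHO met yesterday."])
```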

Here's an example of a populated dialog:

populated reconciliation dialog

Unlike in document comparison, the only thing you can do with a listed document is remove it, via the "X" to its left; the visual position of the documents is controlled by the reconciliation UI.  The first document added will be used as the reference document for scoring/pairing purposes; however, after the pairing is done, the original reference document is not treated differently in any way. 

You can also invoke document reconciliation in a document comparison window, by selecting "File" -> "Reconcile these documents". In this case, the documents being compared will be the ones selected for reconciliation.

Basic voting

When you press "Reconcile", the MAT server will create a new reconciliation document and present it to you:

reconciliation document

In this document, reconciled (agreed-upon) annotations are shown with "normal" presentation, with the highlighting behind the annotated text. The second occurrence here of "WHO" is an example; all three annotators agreed on this annotation. All annotations that are reconciled are shown in this way, whether they were reconciled from the start because all the annotators had agreed on the annotation, or they were reconciled via the voting process we are about to describe.

Those annotations which are in conflict are shown with the annotations "stacked" above the text. You can hover your mouse over these conflicting annotation bars and get information about the annotation, just as you can with normally presented annotations.  Clicking any annotation in the text pane gives you a menu option to scroll to the relevant pairing.  Likewise, clicking on the description of any annotation in the Pairings table will give you the option to scroll to the corresponding annotation in the text pane.  Reconciled pairings have a green check in the lefthand column of the Pairings table, while unreconciled pairings have a red X. 

To vote on a pairing, use the buttons in the "Choices" column of the Pairings table.  You may choose any of the annotations in the pairing as the correct one using the corresponding "Choose" button.  If none of the annotations is a correct annotation, choose "None".  If more than one of the annotations is a distinct, correct annotation, choose the "Split" option, which will allow you to choose or reject (by choosing "None") each of the annotations in that pairing independently.  (If there are multiple copies of the same correct annotation, you can choose any one of them; do not split the pairing in this case.)

If there should be an annotation but none of the options is exactly correct, you may choose one to start from and "Modify" it to create a corrected annotation for the pairing.  If you select "Modify" you will be presented with an annotation editor in which you can modify the attributes of the annotation.  You can also swipe any text overlapping the annotation to be modified to change the extent of the annotation.  When you are done modifying the annotation, hit the "Done" button in the annotation editor to mark that pairing as fully annotated. 
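To summarize the four kinds of vote (choose, none, modify, split) and what each contributes to the final document, here is a small Python sketch. The Vote class, its fields and the string form of the annotations are assumptions made for illustration and do not reflect how MAT stores votes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vote:
    """A reviewer's decision on one pairing (hypothetical structure)."""
    kind: str                            # "choose", "none", "modify" or "split"
    index: Optional[int] = None          # "choose"/"modify": which candidate
    replacement: Optional[str] = None    # "modify": the corrected annotation
    keep: Optional[list[bool]] = None    # "split": accept/reject per candidate


def accepted_annotations(candidates: list[str], vote: Vote) -> list[str]:
    """Return the annotations a pairing contributes once the vote is applied."""
    if vote.kind == "choose":
        return [candidates[vote.index]]
    if vote.kind == "none":
        return []
    if vote.kind == "modify":
        # The reviewer started from candidates[vote.index] and edited it.
        return [vote.replacement]
    if vote.kind == "split":
        # Each candidate is chosen or rejected independently.
        return [c for c, k in zip(candidates, vote.keep) if k]
    raise ValueError(f"unknown vote kind: {vote.kind!r}")


candidates = ["ORGANIZATION[0:3]", "PERSON[0:3]", "ORGANIZATION[0:9]"]
chosen = accepted_annotations(candidates, Vote("choose", index=0))
nothing = accepted_annotations(candidates, Vote("none"))
modified = accepted_annotations(
    candidates, Vote("modify", index=0, replacement="ORGANIZATION[0:4]"))
split = accepted_annotations(candidates, Vote("split", keep=[True, False, True]))
```

Only a split can yield more than one annotation from a single pairing, which is why it is reserved for cases where the alternatives are distinct, correct annotations.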

Once you have voted, the choices disappear and are replaced by a "Review/Change" button; use that button if at any time you wish to review and/or change your vote.  You may at that point make a different vote, or click the "No Change" button.  (Note that you cannot review or change any pairing that was reconciled from the start due to all annotators having agreed upon it.)  At any time you may also re-open all of the reconciled pairings for review using the "Review All" button at the top of the Pairings tab.  If you have any pairings open for review, you may also choose the "Close All" button at the top of the tab to close them all; this is the same as selecting "No Change" in each pairing open for review.

reconciliation document with votes

You may find it convenient to filter the annotations using the menu in the upper left corner of the pairings tab.  It is often helpful while reconciling to view only the remaining unreconciled pairings. 

When you have reconciled all pairings at the current stratum, the "Next Stratum" button will be enabled.  Be sure that you have done any necessary review of your votes at the current level before clicking this button, as you cannot go back. 

When you reconcile spanless annotations, they are displayed in the spanless sidebar, in a style analogous to the way spanned annotations are displayed.  Reconciled pairings will show up as a solid block colored in the agreed-upon annotation type's highlight color, with a green check mark in it.  Unreconciled pairings show up as a red X with lines stacked above it corresponding to the different annotations in that pairing.

reconciliation document showing spanless annotations

Keep in mind that because each level corresponds to a document, overlapping spanned annotations within each document will be layered on top of each other in this view. The facilities described here will help you understand what's in each document level.

Menu options

The "Reconciliation" item in the top menubar is activated when viewing a reconciliation document:

Reconciliation Menu

In addition, "File -> Save..." is active, and the "View" menu is changed to reflect the items which are relevant to reconciliation:

The meanings of these elements are as follows:

Auto-advance

If you check "Auto-advance" in the Reconciliation menu, when you complete a stratum, a dialog will pop up telling you that you have completed the stratum and inviting you to move on to the next stratum.  You may choose not to if you wish to go back and review any of your choices in that stratum.  If you choose to go back and review, it is recommended that you use the "Review All" button and then "Close All" when you are done.  If you review the pairings one by one (when all other pairings are reconciled), you will get the popup each time you re-reconcile a pairing, since once again all the pairings will be reconciled.  Alternatively, you could turn off auto-advance during your review. 

As noted in the dialog, once you move to the next stratum, you cannot go back.

Reconciliation autoadvance dialog

Saving and reloading reconciliation documents

At any point during the reconciliation of a set of documents, you can save the reconciliation document for reloading later.

These documents are always saved as MAT JSON documents; the only difference between these documents and normal MAT JSON documents is that these documents are marked as reconciliation documents and contain metadata about the pairings and votes so far.

To load a saved reconciliation document, use "File" -> "Open file..." in the MAT desktop. In the dialog, select "(reconciliation)" as the workflow/mode; "mat-json" will be automatically selected as the document type, and can't be changed:

open file dialog

Exporting

When all the pairings are reconciled, you can export the reconciliation document to a file or window, via "File -> Export". If the document is not fully reconciled, the UI will tell you this and refuse to export.

By default, the UI will export to a new read-only document window, which you can save if you want. If you want to export to a file, enable the "Export to file" option in the menu before you select "Export", and you'll be able to select an output annotation format, and then a save location.
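The export rule described above (refuse unless every pairing is reconciled, otherwise emit the accepted annotations) can be sketched as follows. The dictionary keys, the exception name and the export_annotations function are hypothetical and are not part of MAT.

```python
class NotFullyReconciled(Exception):
    """Raised when export is attempted with unreconciled pairings."""


def export_annotations(pairings: list[dict]) -> list[str]:
    """Collect the accepted annotations, refusing if any pairing is pending.

    Each pairing is assumed to be a dict with a boolean 'reconciled' flag and
    an 'accepted' list (empty when the reviewer voted "None").
    """
    pending = [i for i, p in enumerate(pairings) if not p["reconciled"]]
    if pending:
        raise NotFullyReconciled(
            f"{len(pending)} pairing(s) still unreconciled")
    return [a for p in pairings for a in p["accepted"]]


finished = [
    {"reconciled": True, "accepted": ["ORGANIZATION[0:3]"]},
    {"reconciled": True, "accepted": []},   # reviewer voted "None"
]
exported = export_annotations(finished)

unfinished = finished + [{"reconciled": False, "accepted": []}]
try:
    export_annotations(unfinished)
    refused = False
except NotFullyReconciled:
    refused = True
```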