Frequently asked questions

Installation and configuration

How do I upgrade the version of Java or Python that MAT depends on?

If you upgrade your version of Java or Python, and you want MAT to use the new version, the easiest thing to do is rerun the installer. See the installation instructions for your platform (Unix, MacOS or Windows native).

What MAT can do

How can I incrementally build a corpus and train a jCarafe model?

The various steps involved in this incremental, tag-a-little, learn-a-little loop deserve their own page, and are described here.

How do I use the MAT UI to hand-annotate documents?

You don't need to train a model or automatically tag data to use MAT. You can use the MAT UI as a pure hand annotation tool. Tutorial 1 covers this process; if your task involves more than simple span annotations, see Tutorial 7.

Here's a summary with pointers to more of the details. First, create your task and install it, or use a task you've already defined. Make sure that this task has a hand annotation step, as illustrated in the sample 'Named Entity' task. Then, start the Web server and load the MAT UI. You can either manually load and save individual documents on your local machine, or you can set up a workspace and access it in the UI.

How do I use the MAT scorer and UI to compare the output of multiple annotators or annotation tools?

To score simple span annotations (that is, annotations whose main label is the distinguishing element) against each other, you don't even need to have a task. You can use the scorer with the --content_annotations option.

If you're comparing multiple annotation tools, you can run them outside of MAT and ensure that their output can be read by MAT by identifying or creating a MAT reader. Alternatively, you can write a wrapper for the annotation tool, and create workflows and model configurations in task.xml which use the tool (this is harder, and not really documented).

If you're comparing multiple annotators, you can have them annotate the files in file mode and then compare the results using the scorer, or you can use the assignment capability in the workspaces to have the files multiply-annotated in the context of a workspace. In the future, you'll be able to score and reconcile these multiply-annotated files directly in the workspace.
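
To make the idea of scoring simple span annotations against each other concrete, here is a minimal Python sketch of exact-match comparison between two annotators. It is not the MAT scorer or its algorithm; the Span type, the compare_spans function and the sample offsets are all hypothetical names and data introduced for illustration, and the sketch assumes that two annotations match only when label, start and end are identical.

```python
from typing import Dict, Iterable, NamedTuple


class Span(NamedTuple):
    """A simple span annotation: a label plus character offsets (hypothetical type)."""
    label: str
    start: int
    end: int


def compare_spans(reference: Iterable[Span], hypothesis: Iterable[Span]) -> Dict[str, float]:
    """Compare two sets of spans by exact match on (label, start, end)."""
    ref, hyp = set(reference), set(hypothesis)
    matched = ref & hyp
    precision = len(matched) / len(hyp) if hyp else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "match": len(matched),
        "missing": len(ref - hyp),    # in the reference only
        "spurious": len(hyp - ref),   # in the hypothesis only
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative data only: the two annotators disagree on the LOCATION extent.
annotator_a = [Span("PERSON", 0, 10), Span("LOCATION", 25, 31), Span("ORGANIZATION", 40, 52)]
annotator_b = [Span("PERSON", 0, 10), Span("LOCATION", 25, 33), Span("ORGANIZATION", 40, 52)]

if __name__ == "__main__":
    print(compare_spans(annotator_a, annotator_b))
```

Exact matching is the strictest possible criterion: the LOCATION pair above counts as one missing and one spurious annotation rather than a partial match. A real scorer may treat overlapping spans or label mismatches as distinct error types.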

How can I use MAT without its Web server to visualize or create annotations?

The MAT document visualization and annotation tool is available as a standalone utility which does not require the MATWeb server.

How can I get a handle on a set of annotated documents I haven't seen before?

If you have documents for which you don't have a task, the first step is to ensure that you can read them. If the documents aren't in a known format, you'll need to write a reader for them (or convert them, outside MAT, to a format that MAT knows). Once MAT can read the documents, you can use MATReport to summarize the annotations which are present in the documents, and you can use the --create_task option to create a task, which you can then install. At that point, you can start MATWeb and load the documents into the UI. Alternatively, you can load the documents into the UI while inferring a task.

Core capabilities

How do I choose between file and workspace modes?

You can find a succinct comparison of file and workspace modes here.

Can the default jCarafe engine use lexicons to enhance its performance?

Yes; see the --lexicon_dir option to the jCarafe training engine. You should also be aware of the case sensitivity of feature specifications.

UI and display

Why can't I see my annotations?

If you've defined your annotations in your task.xml file, and you're sure your document contains annotations, but you can't see them in the UI, the most likely cause is that you haven't assigned any display features to them.

Why does my document take so long to render?

If you have lots of annotations (including tokens), the document redisplay can sometimes take a while. We've found that when a full display redraw is triggered, documents that have more than 1000 annotations will take a noticeable amount of time to redisplay. We've attempted to optimize the panel redraw, but full redisplay is still triggered in a number of circumstances, including when documents are loaded or returned from a server-side annotation process, and sometimes when the annotation window width is changed.
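
If you want to check ahead of time whether a document is likely to cross that 1000-annotation mark, a rough count is enough. The Python sketch below assumes you have already extracted the annotations into a mapping from label to a list of (start, end) spans; that layout, and the names count_annotations and may_redraw_slowly, are hypothetical and are not part of MAT or its document format. Tokens are counted too, since they are annotations.

```python
from typing import Dict, List, Tuple

# Threshold taken from the observation above; treat it as a rule of thumb.
REDRAW_THRESHOLD = 1000

# Hypothetical layout: label -> list of (start, end) character offsets.
AnnotationMap = Dict[str, List[Tuple[int, int]]]


def count_annotations(annotations_by_label: AnnotationMap) -> int:
    """Total number of annotations across all labels, tokens included."""
    return sum(len(spans) for spans in annotations_by_label.values())


def may_redraw_slowly(annotations_by_label: AnnotationMap,
                      threshold: int = REDRAW_THRESHOLD) -> bool:
    """True if the document has more annotations than the threshold."""
    return count_annotations(annotations_by_label) > threshold


# Illustrative data only: 950 tokens plus 60 content annotations.
sample = {
    "TOKEN": [(i * 5, i * 5 + 4) for i in range(950)],
    "PERSON": [(0, 4), (10, 14)] * 30,
}

if __name__ == "__main__":
    print(count_annotations(sample), may_redraw_slowly(sample))
```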

What's that URL in the bottom left corner of the UI?

In Firefox, you'll sometimes notice a URL in the bottom left corner of the MAT UI; if you move your mouse over it, it moves to the bottom right corner. This is an artifact of the particular way that the Yahoo! UI toolkit implements the tabbing that the MAT UI uses. If this URL annoys you, you can almost always make it go away by clicking on one of the tab labels in the UI. We've done our best to eliminate this floating URL, but we haven't been able to work around all of the places it occurs.