Tutorial 6: The experiment engine

At any point, you might have a number of questions about the state of your corpus.

  1. You may want to know simply how well your training and tagging strategies are doing.
  2. You may want to know how the tagging improves as you increase the number of training documents.
  3. You may want to know how models trained on one corpus of documents perform against a different document corpus.
  4. You may want to know how differently constructed models perform against the same corpus.
  5. You may want to know how different tagging strategies affect the output.

These and other questions can be answered easily by the experiment engine, MATExperimentEngine. The power of the experiment engine lies largely in its rich XML configuration. In this tutorial, we'll learn how to use the experiment engine to answer the first of the questions above; the use cases in the other documentation illustrate how you might answer the others.

We're going to use the same simple 'Named Entity' task, and we're going to assume that your task is installed. This tutorial involves both the UI and the command line, so make sure you're familiar with the "Conventions" section in your platform-specific instructions in the "Getting Started" section of the documentation.

Step 1: Review your XML file for question 1

This step is fairly easy, because the XML file to answer the first question is included as part of the distribution. The XML file is found in MAT_PKG_HOME/sample/ne/test/exp/exp.xml, and it looks like this:

<experiment task='Named Entity'>
<corpora dir="corpora">
<partition name="train" fraction=".8"/>
<partition name="test" fraction=".2"/>
<corpus name="ne">
<pattern>*.json</pattern>
</corpus>
</corpora>
<model_sets dir="model_sets">
<model_set name="ne_model">
<training_corpus corpus="ne" partition="train"/>
</model_set>
</model_sets>
<runs dir="runs">
<run_settings>
<args steps="zone,tokenize,tag" workflow="Demo"/>
</run_settings>
<run name="test_run" model="ne_model">
<test_corpus corpus="ne" partition="test"/>
</run>
</runs>
</experiment>

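This is one of the simplest complete experiment XML files you can create. As with all experiment XML files, it describes three types of entities: corpora (the <corpora> element), model sets (the <model_sets> element) and runs (the <runs> element).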
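
To see the three sections at a glance, the file can be parsed with Python's standard xml.etree.ElementTree module. This is only an inspection sketch: the function name summarize_experiment is our own, and the experiment engine does not require or provide it.

```python
import xml.etree.ElementTree as ET

# Copy of the experiment file shown above.
EXP_XML = """
<experiment task='Named Entity'>
  <corpora dir="corpora">
    <partition name="train" fraction=".8"/>
    <partition name="test" fraction=".2"/>
    <corpus name="ne">
      <pattern>*.json</pattern>
    </corpus>
  </corpora>
  <model_sets dir="model_sets">
    <model_set name="ne_model">
      <training_corpus corpus="ne" partition="train"/>
    </model_set>
  </model_sets>
  <runs dir="runs">
    <run_settings>
      <args steps="zone,tokenize,tag" workflow="Demo"/>
    </run_settings>
    <run name="test_run" model="ne_model">
      <test_corpus corpus="ne" partition="test"/>
    </run>
  </runs>
</experiment>
"""

def summarize_experiment(xml_text):
    """Hypothetical helper: list the named corpora, model sets and runs."""
    root = ET.fromstring(xml_text.strip())
    return {
        "task": root.get("task"),
        "corpora": [c.get("name") for c in root.findall("corpora/corpus")],
        "model_sets": [m.get("name") for m in root.findall("model_sets/model_set")],
        "runs": [r.get("name") for r in root.findall("runs/run")],
    }

summary = summarize_experiment(EXP_XML)
for key, value in summary.items():
    print(key, "=", value)
```

For this file, each of the three sections contains exactly one named entry: the corpus "ne", the model set "ne_model" and the run "test_run".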

In most cases, your corpora should consist exclusively of annotated documents which have been marked gold.

So this experiment takes a single set of documents, and designates 80% of the set for training and the remaining 20% for test. It then generates a single model from the training documents, and executes a single run using this model against the test documents.
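
The 80/20 split can be pictured with a small Python sketch. It rests on assumptions: this tutorial does not say how the engine selects documents for each partition (whether it shuffles, or how it rounds), so the shuffle, the seed and the rounding below are illustrative choices, and the document names are placeholders.

```python
import random

def partition(files, fractions, seed=0):
    """Illustrative split of files by a name -> fraction mapping.

    Assumption: a seeded shuffle followed by contiguous slices.
    This is not taken from the MAT implementation.
    """
    shuffled = sorted(files)
    random.Random(seed).shuffle(shuffled)
    names = list(fractions)
    result = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(shuffled)  # the last partition takes the remainder
        else:
            end = start + round(fractions[name] * len(shuffled))
        result[name] = shuffled[start:end]
        start = end
    return result

# Ten placeholder document names (hypothetical).
docs = ["doc%d.json" % i for i in range(1, 11)]
parts = partition(docs, {"train": 0.8, "test": 0.2})
print(len(parts["train"]), len(parts["test"]))  # 8 2
```

The point to take away is that the two partitions are disjoint and together cover the whole corpus, so the model is never scored on documents it was trained on.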

Step 2: Run the experiment

This operation is a command-line operation. Try it:

Unix:

% cd $MAT_PKG_HOME
% bin/MATExperimentEngine --exp_dir /tmp/exp \
--pattern_dir $PWD/sample/ne/resources/data/json sample/ne/test/exp/exp.xml

Windows native:

> cd %MAT_PKG_HOME%
> bin\MATExperimentEngine.cmd --exp_dir %TMP%\exp ^
--pattern_dir %CD%\sample\ne\resources\data\json sample\ne\test\exp\exp.xml

The --exp_dir is the directory where the corpora, models and runs will be computed (and stored, if necessary), and where the results will be found. The --pattern_dir is the directory in which to look for the files referred to in the <pattern> elements in the experiment XML file; the patterns are Unix "glob" patterns, the standard file-matching patterns familiar to any user of the Unix shell. The final argument is the experiment XML file itself.
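
If glob patterns are unfamiliar, the following Python sketch shows how a pattern such as *.json selects files in a directory. It uses the standard glob module; the helper name expand_pattern and the file names are hypothetical, and the engine's own matching may differ in detail.

```python
import glob
import os
import tempfile

def expand_pattern(pattern_dir, pattern):
    """Return the sorted basenames in pattern_dir that match a glob pattern."""
    paths = glob.glob(os.path.join(pattern_dir, pattern))
    return sorted(os.path.basename(p) for p in paths)

# Demonstration with placeholder file names in a scratch directory.
with tempfile.TemporaryDirectory() as pattern_dir:
    for name in ["doc1.json", "doc2.json", "notes.txt"]:
        open(os.path.join(pattern_dir, name), "w").close()
    print(expand_pattern(pattern_dir, "*.json"))     # ['doc1.json', 'doc2.json']
    print(expand_pattern(pattern_dir, "doc?.json"))  # ['doc1.json', 'doc2.json']
```

Here * matches any run of characters and ? matches exactly one, so notes.txt is excluded by both patterns.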

The engine will create the directory, copy the experiment XML file into it for archive purposes, and then run the experiment as described in step 1.

Step 3: Review the results

Look in the experiment directory.

Unix:

% ls /tmp/exp

Windows native:

> dir %TMP%\exp

allbytag_excel.csv corpora model_sets
allbytoken_excel.csv exp.xml runs

The corpora, model_sets and runs subdirectories are as specified in the experiment XML file above (that's what the "dir" attribute does). What you'll be most interested in are the files allbytag_excel.csv and allbytoken_excel.csv. These files contain the tag-level and token-level scoring results (including Excel-style formulas) for all the runs. The format and interpretation of these results are described in the documentation for the scoring output, except that the initial columns are different; you can find a description of the differences in the documentation for MATExperimentEngine.
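
If you'd rather inspect these files from a script than from a spreadsheet, a minimal Python sketch using the standard csv module looks like this. The helper name read_scores is our own, the path is the Unix one used in this tutorial, and the file is assumed to be readable in your platform's default text encoding. No column names are assumed; the sketch simply prints whatever header the file has.

```python
import csv
import os

def read_scores(path):
    """Return (header, rows) from a scoring CSV file.

    Cells holding Excel-style formulas come back as plain text;
    the csv module does not evaluate them.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = list(reader)
    return header, rows

# Path from the Unix example above; adjust it for your platform.
path = "/tmp/exp/allbytag_excel.csv"
if os.path.exists(path):
    header, rows = read_scores(path)
    print(header)
    print(len(rows), "data rows")
```

Because the formulas are not evaluated, any computed values will only appear once the file is opened in a spreadsheet program.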

Under /tmp/exp/runs, you'll see a directory for each named run (in this case, only "test_run"), and below that, a directory for the name of the model configuration ("ne_model" in this case):

Unix:

% ls /tmp/exp/runs/test_run/ne_model

Windows native:

> dir %TMP%\exp\runs\test_run\ne_model

_done bytoken_excel.csv hyp
bytag_excel.csv details.csv run_input

The important elements here are the individual scoring files bytag_excel.csv and bytoken_excel.csv, which are (approximately) the subset of the corresponding overall scoring files which is relevant to this run. Of greater interest is details.csv, which is the detail spreadsheet for this run. These detail spreadsheets are not aggregated at the top level because they contain an entry for each tag, and the volume of data would likely be too great.

For more details about the structure of the experiment output directory, see the documentation for MATExperimentEngine. For detailed examples for the other questions posed above, see the experiment XML documentation.

Step 4: Run an experiment against a workspace

We can treat a workspace, or a portion of a workspace, as a corpus for the experiment engine. If you've done Tutorial 5, and you kept your workspace around, you can run a simple experiment against that workspace using MATWorkspaceEngine:

Unix:

% cd $MAT_PKG_HOME
% bin/MATWorkspaceEngine /tmp/ne_workspace run_experiment \
--test_basename_patterns 'voa3,voa4' --test_step tag

Windows native:

> cd %MAT_PKG_HOME%
> bin\MATWorkspaceEngine.cmd %TMP%\ne_workspace run_experiment ^
--test_basename_patterns "voa3,voa4" --test_step tag

You can specify an experiment XML file to use, but in this simple example, we're not using one; we're treating all the gold documents as the training corpus, except those which match the --test_basename_patterns (which will be used as the test corpus).
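
The division into training and test documents described here can be sketched in Python as follows. The sketch makes assumptions: the basenames are placeholders, and the comma-separated patterns are treated as glob patterns matched against whole basenames, which this tutorial does not spell out.

```python
import fnmatch

def split_by_basename(basenames, test_patterns):
    """Illustrative split: matching basenames form the test set, the rest train.

    Assumption: test_patterns is a comma-separated list of glob patterns.
    """
    patterns = test_patterns.split(",")
    test = [b for b in basenames
            if any(fnmatch.fnmatchcase(b, p) for p in patterns)]
    train = [b for b in basenames if b not in test]
    return train, test

# Hypothetical basenames of gold documents in the workspace.
gold = ["voa1", "voa2", "voa3", "voa4", "voa5"]
train, test = split_by_basename(gold, "voa3,voa4")
print(train)  # ['voa1', 'voa2', 'voa5']
print(test)   # ['voa3', 'voa4']
```

Choosing the test documents by name, rather than by fraction, makes the experiment repeatable: the same documents are held out each time you run it.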

The experiment results will be in the workspace directory in experiments/<date>.

Step 5: Clean up (optional)

Remove your experiment directories:

Unix:

% rm -rf /tmp/exp

Windows native:

> rd /s /q %TMP%\exp

If you're not planning on doing any other tutorials, remove the workspace:

Unix:

% rm -rf /tmp/ne_workspace

Windows native:

> rd /s /q %TMP%\ne_workspace

If you don't want the "Named Entity" task hanging around, remove it as shown in the final step of Tutorial 1.

This concludes Tutorial 6.