Viewing language data in CSV files

Note: the information below was current as of November 2011, when MAT 1.3 was released. It's possible that Excel's import features have improved since then. We provide this information as representative of the problems you may encounter.

The MATScore and MATReport tools both produce CSV files that contain snippets of your input documents. Viewing these CSV files is a bit complicated and deserves some attention.

The short answer is: use OpenOffice or LibreOffice rather than Excel.

Excel 2007 CSV import has some very unpleasant features which will compromise your ability to view the data cleanly.

Another issue is character encoding. All the CSV documents created by the MAT tools are encoded in UTF-8. To view this data correctly in Excel 2007, you must use the import wizard, which is only available on Windows.
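If you only need to inspect the data, and not edit it in a spreadsheet, one way around the encoding problem is to read the file with a short script. The following is a minimal sketch in Python 3, not part of MAT; the function name read_csv_as_text is made up for illustration. It relies only on the fact stated above, that the files are UTF-8, and it keeps every cell as a plain string so that no value is reinterpreted as a number or a date.

```python
import csv


def read_csv_as_text(path):
    """Return the rows of a UTF-8 encoded CSV file as lists of strings.

    Hypothetical helper for illustration; it is not part of MAT.
    """
    # encoding="utf-8" matches the encoding the MAT tools are documented to use.
    # newline="" lets the csv module handle line breaks inside quoted cells.
    with open(path, "r", encoding="utf-8", newline="") as f:
        # csv.reader never converts cell values, so every cell stays a string.
        return list(csv.reader(f))
```

For example, calling read_csv_as_text("scores.csv"), where scores.csv stands in for whatever file name you gave the tool, returns a list of rows, the first of which is the header row.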

Because there's no consistent way of viewing the data in its clean form, Excel isn't an appropriate tool, especially on the Mac.

Fortunately, OpenOffice and LibreOffice, version 3 or later, do the right thing. You'll be offered an import wizard when you open a .csv file. Select the column delimiter (comma), and make sure to change the column format to Text for each column that contains spans of text.