Translation Error Rate

Translation Error Rate is an error metric for machine translation that measures the minimum number of edits required to change a system output into one of the references. NOTE: HTER (TER with human-targeted references) requires post-editing of the system output; software for post-editing is not provided here.
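As a rough illustration only (not the distributed tercom implementation), a simplified TER-style score can be computed as word-level edit distance against the closest reference, divided by the average reference length. Full TER additionally counts phrase shifts as single edits, which this sketch omits:

```python
def word_edit_distance(hyp, ref):
    """Minimum insertions, deletions, and substitutions between two word lists."""
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all hypothesis words
    for j in range(n + 1):
        d[0][j] = j  # insert all reference words
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match or substitution
    return d[m][n]

def simple_ter(hypothesis, references):
    """Edits against the closest reference, normalized by average reference length.

    Shift edits, which full TER handles, are not modeled here.
    """
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    best = min(word_edit_distance(hyp, r) for r in refs)
    avg_len = sum(len(r) for r in refs) / len(refs)
    return best / avg_len
```

For example, `simple_ter("a b c d", ["a b c e"])` is 0.25: one substitution over a four-word reference.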

TER-Plus (terp) was our submission to the NIST Metric MATR 2008 workshop. While it can be used to compute TER, it is a more powerful tool designed to correlate better with human judgments. Software and documentation for that metric can be found at: http://www.umiacs.umd.edu/~snover/terp.

Please read the AMTA 2006 paper on TER for more details on the error metric.

The tercom software is copyright 2005-2006 by BBN Technologies and the University of Maryland. It is freely distributed under the attached license.

Downloads

TER COMpute Java code (contains tercom.jar and sample data)
Date    | Version                          | Download                | Changes
8/19/08 | Version 0.7.25 (current version) | [ tercom-0.7.25.tgz ]   | Tokenization changed so that 's is correctly tokenized when using the -N flag. No other changes from version 0.7.2.
3/26/08 | Version 0.7.2                    | [ tercom-0.7.2.tgz ]    | Several bug fixes. Scores may vary slightly from version 0.7.0.
3/27/07 | Version 0.7                      | [ tercom-0.7.tgz ]      | First full Java release. Several bug fixes.
4/28/06 | Basic Java Classes               | [ java_ter_060428.tgz ] | Bug fix, and an increase in the beam width (to 20) for minimum edit distance.
The Basic Java Classes release contains some basic Java classes which can be used to calculate TER. It is much faster than the Perl script, but the UI is not very developed. It is useful if you wish to examine the algorithm to understand more fully how it works. This code is provided here for research and educational purposes; it was designed to be integrated into NIST's HTER annotation tool. A test wrapper for the code is included, but it is not as functional as the tercom script.
The full version contains this code, plus bug fixes and additional functionality.

TER COMpute Perl code (contains tercom.pl and sample data)
The Perl versions are no longer supported. Please use the Java version (it is also much faster).
Date     | Version    | Download          | Changes
4/10/06  | Version 6b | [ tercom_v6b.tgz ] | Added document-level scoring (.sum_doc output); added support for n-best list scoring (.sum_nbest output); added option -N to normalize hyp and ref as in MT eval scoring; increased precision of floating-point output.
11/10/05 | Version 5  | [ tercom5.tgz ]    | Changed program name from calc_ter to tercom; changed default options (-f 2 is now the default); added support for sgm files (-i sgm); added option to specify the beam width.

AMTA 2006 Paper on TER [ pdf download ] [ ppt slides ] [ pdf slides ]
Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul, "A Study of Translation Edit Rate with Targeted Human Annotation," Proceedings of Association for Machine Translation in the Americas, 2006.

TER Technical Report [ pdf download ]
Matthew Snover, Bonnie J. Dorr, Richard Schwartz, John Makhoul, Linnea Micciulla, and Ralph Weischedel, "A Study of Translation Error Rate with Targeted Human Annotation," LAMP-TR-126, CS-TR-4755, UMIACS-TR-2005-58, University of Maryland, College Park, MD, July 2005.

References

References to TER should cite:

Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul, "A Study of Translation Edit Rate with Targeted Human Annotation," Proceedings of Association for Machine Translation in the Americas, 2006.