Monday, July 11, 2011

Fuzzy matches: the good and the bad.


In writing this post, I am assuming that you use CAT tools, whether gleefully or grudgingly. The debate over whether they are a boon or a bane can be found on the ProZ.com forums and elsewhere.

One thing I enjoy about translating with a CAT tool is that I can recycle my translations. If a segment is identical or similar to one I have translated before, and is therefore stored in the translation memory (TM), I can reuse all of it (a 100% match) or some of it (a fuzzy match). Needless to say, this makes me feel all warm and fuzzy inside.
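For readers curious about where the percentage comes from, here is a minimal sketch in Python. It is an illustration only: every CAT tool has its own scoring method, and this one simply compares two source segments word by word using Python's standard difflib module. The function name match_percent and the sample sentences are made up for the example.

```python
from difflib import SequenceMatcher


def match_percent(new_segment: str, tm_segment: str) -> int:
    """Rough word-level similarity between two source segments, as a percentage.

    Illustrative only: real CAT tools use their own (often undisclosed) formulas.
    """
    new_words = new_segment.split()
    tm_words = tm_segment.split()
    ratio = SequenceMatcher(None, new_words, tm_words, autojunk=False).ratio()
    return round(ratio * 100)


if __name__ == "__main__":
    stored = "The Buyer shall pay the Seller."
    incoming = "The Buyer shall pay the Supplier."
    # An identical segment scores 100; a similar one scores somewhere below that.
    print(match_percent(stored, stored))
    print(match_percent(incoming, stored))
```

An identical segment scores 100, which corresponds to a 100% match; anything lower but still above the tool's threshold would be offered as a fuzzy match.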

As legal translators, we can benefit from fuzzy matches because there is so much boilerplate text out there, especially in contracts. Many a time, a client will send me a stack of very similar contracts in which only a few words change from one contract to the next. Here are the kinds of changes to look for:

Names: Often the party signing a contract with the company drawing up the contract will change. Place names in addresses change too.

Numbers: Tax IDs, addresses, dates and sums of money all include numbers that may change.

Dates: Different contracts may have different dates. It is important to check the day, month and year, and to watch for less obvious changes such as dates written out in words.
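Numbers are the easiest of these to check mechanically. The sketch below, again in Python, pulls the figures out of a new source segment and out of the matching TM source segment and reports which ones differ. The pattern and the function name changed_numbers are my own assumptions for illustration, not a feature of any particular CAT tool, and the check will not catch names or dates written out in words.

```python
import re
from collections import Counter

# Digits, optionally joined by ".", ",", "/" or "-" (amounts, dates, ID numbers).
NUMBER = re.compile(r"\d+(?:[.,/-]\d+)*")


def changed_numbers(new_source: str, tm_source: str):
    """Return (numbers only in the new segment, numbers only in the TM segment)."""
    new_nums = Counter(NUMBER.findall(new_source))
    old_nums = Counter(NUMBER.findall(tm_source))
    only_new = sorted((new_nums - old_nums).elements())
    only_old = sorted((old_nums - new_nums).elements())
    return only_new, only_old


if __name__ == "__main__":
    tm = "Payment of EUR 1,200.00 is due on 12/03/2011."
    new = "Payment of EUR 1,500.00 is due on 12/03/2011."
    added, dropped = changed_numbers(new, tm)
    print("New in this contract:", added)
    print("Left over from the TM:", dropped)
```

Any figure listed as left over from the TM is one that must not survive in the target segment.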

Rule of thumb for fuzzy match usage: Set your minimum fuzzy match threshold to 80% so that you are not offered matches that need too many changes. When the TM match pane shows so many changes that it looks confusing, retype the entire segment; sometimes that is easier. Whenever you edit a fuzzy match, use your finger to check your changes against the text marked in red in the TM match pane. It doesn't hurt to compare the source segment with the target segment as well, in case you missed something. When proofreading your translation before exporting it, recheck the fuzzy match segments (they are marked with a percentage and, in some CAT tools, a different color) to make sure that you didn't leave anything out or add anything. Nothing is worse than handing in a translation with incorrect information in it!
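The red markings in the TM match pane are essentially a word-by-word comparison of the old and new source segments. Here is a rough Python sketch of that comparison, assuming a simple split on spaces; actual CAT tools segment and highlight text in their own ways, and the function name source_changes and the sample contract sentences are invented for the example.

```python
from difflib import SequenceMatcher


def source_changes(tm_source: str, new_source: str):
    """List the word-level differences between the TM source and the new source.

    Each item is (kind, old words, new words), where kind is
    "replace", "delete" or "insert".
    """
    old_words = tm_source.split()
    new_words = new_source.split()
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            changes.append(
                (kind, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes


if __name__ == "__main__":
    tm = "This Agreement is made on 1 March 2010 between Acme Ltd and John Smith."
    new = "This Agreement is made on 5 April 2011 between Acme Ltd and Jane Smith."
    for kind, old, replacement in source_changes(tm, new):
        print(f"{kind}: '{old}' -> '{replacement}'")
```

Each item in the list is something to tick off against the target segment before confirming it.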

1 comment:

SEO Translator said...

Good post. However, use fuzzy matches with caution. I had 80% matches in short texts where NONE of the words matched (as the match is calculated based on characters, word length, etc.).

Even 100% matches can be dangerous - I encountered a case where the gender (in Spanish) was changing continuously between the different sentences, even when the original (English) was exactly the same!