Why Do Arabic Translation QA Reports Have So Many False Positives?
Clients often ask why Arabic translation QA reports contain more false positive (FP) errors than reports for any other language, especially Latin languages. As an Arabic translation and localization services provider, we have seen the same scenario many times: a client comes back to us in a panic, asking us to review and implement a long QA report generated by a translation QA tool.
They are usually surprised, and of course relieved, when we return the report with all or most of the issues marked as false positives. We usually add general comments explaining why these errors are FPs, and those comments are what prompted this blog post.
It is important that clients, especially those who are not native speakers of the language, know at least the most common reasons for false positives in Arabic translation QA reports. This saves time and effort for both the Arabic language service provider and the client.
Most Common False Positive Issues in Arabic Translation QA Reports
1. Punctuation Issues – Missing Spaces Before and After Tags/Placeholders
- This happens when a tag follows the conjunction “and”. Its Arabic equivalent is the single-letter conjunction “و”, which is attached directly to the next word and must not be followed by a space.
- It can also happen due to the different structure of the English and Arabic sentences, causing different word/tag order.
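The conjunction case can be expressed as a rule. The following is a minimal Python sketch, not the behavior of any particular QA tool: the placeholder syntax (`{NAME}`, as in the tag example later in this post) and the function name are assumptions made for illustration. It flags placeholders that are not preceded by a space, unless the placeholder is attached to a standalone “و”.

```python
import re

# Assumed placeholder syntax, e.g. {DATE}; real projects may use other tag formats.
PLACEHOLDER = re.compile(r"\{[A-Z_]+\}")
WAW = "\u0648"  # Arabic conjunction letter waw ("and")


def missing_space_before_placeholder(target: str) -> list[str]:
    """Return placeholders that are not preceded by a space.

    A placeholder attached to a standalone waw is not reported, because
    the conjunction is written joined to whatever follows it.
    """
    flagged = []
    for match in PLACEHOLDER.finditer(target):
        start = match.start()
        # Placeholder opens the segment or already has a space before it.
        if start == 0 or target[start - 1].isspace():
            continue
        # Exempt: the previous character is a waw that itself starts a word.
        if target[start - 1] == WAW and (start == 1 or target[start - 2].isspace()):
            continue
        flagged.append(match.group())
    return flagged
```

A waw that is the last letter of a longer word is still reported, since only a waw at the start of a word is treated as the conjunction here.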
2. Missing Numbers
- This happens when numbers are translated into words.
- It also happens with the numbers 1 and 2, which are usually not kept in translation in cases like “1 file” and “2 files”, because the Arabic singular form does not require the number “1”. For example, “ملف” already implies “1 file”, and the dual form “ملفان” already implies “2 files”.
- Missing numbers are also reported when Hindi (Arabic-Indic) digits such as “٣” are used while Latin digits such as “3” were expected. Whether to use Latin or Hindi digits depends on the client's preference.
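Two of the cases above lend themselves to a simple rule: digit format and the implied 1 and 2. The sketch below is one possible approach, with hypothetical function names. It maps every Unicode decimal digit to its ASCII equivalent before comparing, and it ignores a missing “1” or “2”. Numbers written out as words are not handled, and the blanket exemption for 1 and 2 is an assumption that could hide a real omission, so a production rule would likely need to be narrower.

```python
import re
import unicodedata
from collections import Counter


def to_ascii_digits(text: str) -> str:
    """Map every Unicode decimal digit (e.g. Arabic-Indic) to ASCII 0-9."""
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch for ch in text
    )


def missing_numbers(source: str, target: str, implied=("1", "2")) -> list[str]:
    """Return source numbers absent from the target.

    Numbers listed in `implied` are skipped, because the Arabic singular
    and dual forms can express them without a digit.
    """
    src = Counter(re.findall(r"[0-9]+", to_ascii_digits(source)))
    tgt = Counter(re.findall(r"[0-9]+", to_ascii_digits(target)))
    missing = src - tgt  # keeps only numbers the target lacks
    for number in implied:
        missing.pop(number, None)
    return sorted(missing.elements())
```

With this rule, a target that writes “٣” for a source “3” is no longer reported, while a number such as 15 that is dropped entirely still is.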
3. Terminology – Non-Matching Glossary Items
This type of false positive issue is usually caused by any of the following:
- Diacritics being used in either the glossary term translation or the target translation.
- Different encoding for some letters, although they appear exactly the same and are both correct.
- Plural and singular forms used in either the glossary or the target translation.
- Articles being used in one version but not the other, leading to different spellings.
- Gender differentiation in adjectives, leading to variations in spelling.
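Some of these causes can be reduced by normalizing both the glossary term and the target before comparing them. The sketch below is illustrative only. It applies Unicode NFKC normalization (which folds Arabic presentation forms into the base letters), removes the common diacritics and the tatweel, maps two assumed lookalike letters (Farsi yeh and keheh) to their Arabic counterparts, and strips a leading definite article “ال”. The article rule is a rough heuristic that will also strip words that merely begin with those letters, and plural or gender variants are not covered because they need morphological analysis. All names are hypothetical.

```python
import re
import unicodedata

# Harakat (U+064B-U+0652), superscript alef (U+0670) and tatweel (U+0640).
DIACRITICS_AND_TATWEEL = re.compile("[\u064B-\u0652\u0670\u0640]")
# Assumed lookalike pairs: Farsi yeh -> Arabic yeh, keheh -> Arabic kaf.
LOOKALIKES = str.maketrans({"\u06CC": "\u064A", "\u06A9": "\u0643"})
ARTICLE = "\u0627\u0644"  # definite article alef + lam


def normalized_words(text: str) -> list[str]:
    """Split text into words after folding encoding and diacritic variants."""
    text = unicodedata.normalize("NFKC", text)
    text = DIACRITICS_AND_TATWEEL.sub("", text)
    text = text.translate(LOOKALIKES)
    words = []
    for word in re.findall(r"\w+", text):
        # Heuristic: drop a leading article from words longer than 3 letters.
        if word.startswith(ARTICLE) and len(word) > 3:
            word = word[len(ARTICLE):]
        words.append(word)
    return words


def glossary_term_used(term: str, target: str) -> bool:
    """Check whether the normalized term occurs as a word sequence in target."""
    term_words = normalized_words(term)
    target_words = normalized_words(target)
    n = len(term_words)
    return any(
        target_words[i:i + n] == term_words
        for i in range(len(target_words) - n + 1)
    )
```

Normalizing to NFKC before removing diacritics keeps letters such as “أ” intact, because the hamza stays composed with its base letter instead of being removed as a separate mark.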
4. Inconsistency
- You might have two source segments with the same text but different numbers. In Arabic, the translations cannot always be identical apart from the number, because the form of the noun changes with the count.
Example:
“2 files found.” → “ملفين”
“3 files found.” → “ملفات”
“11 files found.” → “ملف”
- Different translations for the same source word depending on context.
Example:
“Open” could be a button name or a status, requiring different translations.
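The number-dependent case can be handled by comparing translations only among segments whose numbers fall into the same Arabic plural category. The sketch below uses the categories defined by the Unicode CLDR plural rules for Arabic (zero, one, two, few, many, other). The idea of building a grouping key from them, and the function names, are assumptions made for illustration. The context-dependent case (“Open”) is not addressed here, since it needs information that the text of the segment does not contain.

```python
import re


def arabic_plural_category(n: int) -> str:
    """Return the CLDR plural category of a non-negative integer for Arabic."""
    if n == 0:
        return "zero"
    if n == 1:
        return "one"
    if n == 2:
        return "two"
    if 3 <= n % 100 <= 10:
        return "few"
    if 11 <= n % 100 <= 99:
        return "many"
    return "other"


def consistency_key(source: str) -> str:
    """Replace each number with its plural category.

    Segments are expected to share a translation pattern only when their
    keys are equal, so "2 files" and "3 files" are no longer compared.
    """
    return re.sub(
        r"[0-9]+",
        lambda m: "<" + arabic_plural_category(int(m.group())) + ">",
        source,
    )
```

Under this key, “11 files found.” and “12 files found.” are still checked against each other, while “2 files found.” and “3 files found.” are treated as different sources.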
5. Tag Issues
- Tag order change: Arabic sentence structure often differs from English, which can change tag order.
Example:
English: “On {DATE}, {NO_OF_POINTS} points will be added to your credit.”
Arabic (word order shown in English): “{NO_OF_POINTS} points will be added to your credit on {DATE}.”
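A tag check that ignores order avoids this false positive while still catching tags that are really missing or added. The sketch below assumes the `{NAME}` placeholder syntax used in the example above. The function name is hypothetical, and this is not how any specific QA tool works.

```python
import re
from collections import Counter

# Assumed placeholder syntax, matching {DATE} and {NO_OF_POINTS} above.
PLACEHOLDER = re.compile(r"\{[A-Z_]+\}")


def tag_mismatches(source: str, target: str) -> dict[str, list[str]]:
    """Compare placeholders as multisets, so reordering is not an error."""
    src = Counter(PLACEHOLDER.findall(source))
    tgt = Counter(PLACEHOLDER.findall(target))
    return {
        "missing": sorted((src - tgt).elements()),  # in source, not in target
        "extra": sorted((tgt - src).elements()),    # in target, not in source
    }
```

For the example pair above, both lists come back empty, because the same two placeholders appear in each segment regardless of position.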
How to Reduce False Positives in Arabic QA
To avoid so many false positives in Arabic translation QA checks, translation QA tools need to be taught about such cases. Although feeding these cases into a QA system is not an easy job, rules can cover at least some of the most repetitive ones, such as diacritic handling, number formats, articles, different character encodings, spacing, and conjunctions. This is what we are currently doing with the translation QA tool developed internally at Saudisoft.
Translation QA tools report other kinds of false positives as well, some of them specific to client preferences. Our team at Saudisoft will be happy to discuss the above in detail, answer any questions you may have, or offer language consultancy not only for Arabic, but also for several other languages.