Many companies use machine translation to meet a variety of translation needs. This is one of the reasons why SAP provides machine translation focused on SAP and enterprise content via the Document Translation Service, one of the services available on SAP Business Technology Platform.
The service is already used in many different business scenarios across SAP, and you may have come across one or another of them. However, machine translation technology evolves continuously and remains an active field of research. To maintain a high-quality machine translation offering for businesses that want to provide multilingual content, the machine translation system needs to keep up with current technology. Over the past years, we have seen increasing demand for the translation of conversational content in a business context, for example support chat dialogues, as highlighted in this blog post by Janos Nagy.
The challenge in handling conversational content is to tolerate imperfect input while maintaining translation quality. Conversational content often includes lower-quality source text with typos, missing or inconsistent casing, missing punctuation, informal words, and so on.
Here are some examples of such input:
why my gdm is not working
it says .service file is not there
what are you doing exactly
where is it exactly
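To make these characteristics concrete, the following minimal sketch flags two of the surface signals mentioned above, missing initial capitalization and missing sentence-final punctuation, on the example inputs. The function name `surface_noise` and the two checks are illustrative assumptions for this post, not part of the Document Translation Service or of the methods we investigated.

```python
# Sample inputs copied from the examples above.
EXAMPLES = [
    "why my gdm is not working",
    "it says .service file is not there",
    "what are you doing exactly",
    "where is it exactly",
]


def surface_noise(text: str) -> dict:
    """Flag two simple surface signals of informal input.

    Hypothetical heuristic for illustration only: it checks whether the
    text starts with a lowercase letter and whether it lacks a final
    '.', '!' or '?'.
    """
    stripped = text.strip()
    return {
        "starts_lowercase": bool(stripped) and stripped[0].islower(),
        "no_final_punctuation": bool(stripped) and stripped[-1] not in ".!?",
    }


if __name__ == "__main__":
    for line in EXAMPLES:
        print(f"{line!r}: {surface_noise(line)}")
```

All four examples trigger both flags, whereas a curated sentence such as "Where is it?" triggers neither. Checks like these only describe the input; they do not by themselves improve translation quality.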
In a business context, conversational content is quite far removed from the often carefully curated content such as learning or marketing materials. To improve quality for this type of content, adding more previously translated data alone is not the answer. This is partly because the amount of translated conversational content needed would exceed the available resources, and partly because the different nature of conversational content could influence the translation of formal content, which should remain formal. The machine translation research community is actively investigating how to handle such input and improve machine translation output for it.
The machine translation team at Language Experience investigated several methods from current research over several months. Following a research-driven approach, we were able to judge the impact of specific methods on our systems and incorporate the most promising techniques to improve quality for conversational content. At the same time, we observed no degradation for other content types, which is of paramount importance for our offering. We are happy to report that a publication detailing our findings and experiences was accepted at the EAMT 2022 conference, an established conference in the machine translation area (https://eamt.org/). The paper was accepted after peer review, and a member of our team was able to attend the conference, present the paper, and exchange ideas with the research community.