
An Analysis on Computer-Assisted Translation through Google Translator Toolkit

2018-05-04 Zhang Songsong (张嵩松)

English on Campus (校园英语·上旬), 2018, Issue 1

【Abstract】With the rapid growth of artificial intelligence, computer-assisted translation (CAT) has become more popular than ever. This paper describes the problems and pitfalls encountered when translating on the Google Translator Toolkit platform. It also offers solutions to those problems and suggestions on how to improve the CAT system.

【Key words】Analysis; Computer-Assisted; Translation

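【About the author】Zhang Songsong (张嵩松), Guizhou Provincial Center for Foreign Economic Cooperation in Agriculture (贵州省农业对外经济合作中心).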

1. General Information

This paper uses the Google Translator Toolkit platform to help translate legal documents and regulations from English to Chinese. Legal documents and regulations in different industries or fields are widely regarded as quite similar in grammatical and sentence structure, so machine translation is well suited to assist in translating them (either from English to Chinese or from Chinese to English), provided that a translation memory and glossary of appropriate size are available. The paper therefore collects bilingual (Chinese and English) legal documents and regulations from the official websites of well-known companies such as Samsung, Apple and Philips. The collected text is made into a translation memory containing 2,848 English characters and 4,068 Chinese characters. In addition, frequently used words and terms are selected to make a glossary.

2. Translation Memory

To serve the paper's goal of assisting the translation of legal documents and regulations, the translation memory of the CAT system contains 100 translation units, collected mainly from legal documents and regulations. The segment size of each translation unit is the sentence. The translation quality of each segment is professional because all translation units are taken from official translations, and the English and Chinese texts are fully matched. Some examples from the translation memory illustrate this.

(1)ENG:All information, documents, products and services, trademarks, logos, graphics, and images (“Materials”) provided on this site are copyrighted or trademarked and are the property of Samsung Group, Samsung Electronics and its listed subsidiaries.

CHN:本网中所提供的所有信息、文件、产品, 以及服务、商标、logo、图形,以及图片(以上涉及内容以下简称为“资料”)都是具有版权或已经注册的商标,是三星集团、三星电子及其子公司的财产。

In Example 1, the segment is a single sentence from the official website of Samsung, and the translation is faithful. The English and Chinese texts match almost perfectly.

(2)ENG:In addition, you may not distribute an End User Product the purpose of which is to replay the courseware, presentations, interactive multimedia material, interactive entertainment products and the like of others.

CHN:另外, 您不得为播放课件、演示文稿、交互式多媒体资料、交互式娱乐产品等目的而分发“最终用户产品”。

Example 2 shows that the structure of the Chinese sentence (target language) differs from that of the English sentence (source language) because the English sentence contains a clause. Nevertheless, the translation is professional despite the differences in grammatical structure.

The data in the translation memory come from the official websites of first-class enterprises. The source-language (English) and target-language (Chinese) texts are therefore accurate, and the quality of the translations can be relied on.
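The structure described in this section can be pictured as a list of sentence-level segment pairs. The following is a minimal sketch under stated assumptions: the `TranslationUnit` class and the `normalize` and `exact_match` functions are hypothetical names chosen for illustration, not part of Google Translator Toolkit, and the single unit stored here is Example 2 above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TranslationUnit:
    """One sentence-level segment pair (hypothetical structure)."""
    source: str  # English segment
    target: str  # Chinese segment


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial differences do not block a match."""
    return " ".join(text.split()).lower()


def exact_match(memory: List[TranslationUnit], sentence: str) -> Optional[str]:
    """Return the stored target for a 100% match, or None if there is none."""
    wanted = normalize(sentence)
    for unit in memory:
        if normalize(unit.source) == wanted:
            return unit.target
    return None


# A one-unit memory holding Example 2 from this section.
memory = [
    TranslationUnit(
        source=(
            "In addition, you may not distribute an End User Product the purpose "
            "of which is to replay the courseware, presentations, interactive "
            "multimedia material, interactive entertainment products and the like "
            "of others."
        ),
        target="另外, 您不得为播放课件、演示文稿、交互式多媒体资料、交互式娱乐产品等目的而分发“最终用户产品”。",
    ),
]

# A sentence that is not stored yields no pre-translation.
print(exact_match(memory, "Philips is a registered trademark of Philips electronics."))
```

With the sentence as the segment size, a lookup succeeds only when the whole sentence has been stored before, so a memory of 100 units covers very few new sentences.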

3. Glossary

It is well known that legal and regulatory translation involves many terms and set translations. Hence, frequently used words and terms in English and Chinese need to be collected to assist machine translation of legal documents.

The glossary contains 104 frequently used words and terms selected from the translation memory. The translations in the glossary are also professional and reliable because the glossary has the same source as the translation memory. Some examples follow:

copyright laws (版权法), trademark law (商标法), laws of privacy and communications statutes (通信条例), patent laws (专利法). These terms are all names of laws, so their translations are fixed. Collecting such terms can therefore improve the efficiency and accuracy of translating legal documents and regulations.
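Because such terms have fixed translations, a glossary lookup can be sketched as a simple phrase search. The example below is illustrative only: `find_terms` is a hypothetical helper, the sample sentence is made up for the demonstration, and the paper does not describe how Google Translator Toolkit matches glossary entries internally.

```python
import re
from typing import Dict, List, Tuple

# Three of the term pairs listed in this section.
GLOSSARY: Dict[str, str] = {
    "copyright laws": "版权法",
    "trademark law": "商标法",
    "patent laws": "专利法",
}


def find_terms(sentence: str, glossary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (English term, Chinese term) pairs found in the sentence.

    Terms are checked longest first and matched on word boundaries,
    ignoring case.
    """
    hits = []
    for term in sorted(glossary, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            hits.append((term, glossary[term]))
    return hits


# Made-up sentence, used only to exercise the lookup.
sample = "The materials are protected by Copyright Laws and patent laws."
for english, chinese in find_terms(sample, GLOSSARY):
    print(english, "->", chinese)
```

A lookup of this kind works at the level of words and phrases, which is why glossary entries can still contribute when no whole-sentence match exists.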

4. Evaluation of the System

The purpose of this evaluation is to assess the translation results of the CAT system. The performance of the CAT system clearly depends on the data collected in the translation memory.

To test the performance of the CAT system on legal documents, an appropriate legal text is needed for evaluation. The sample chosen for this evaluation is extracted from the official website of Philips (http://www.philips.com.cn/) and comprises 25 translation units totaling 801 English words. The procedure is to log in to Google Translator Toolkit and upload the translation memory and the glossary, both of which are high-quality bilingual Chinese and English texts translated by professionals. The sample text is then uploaded to Google Translator Toolkit to obtain the machine translation (MT) result, which is compared with the professional translation.

The quality of the MT output is not ideal. Only 15% of the pre-translation comes from human translation, mainly from the uploaded glossary, and 85% (709 words) comes from machine translation. In addition, only eight words are TM 100% matches, and there are no in-context matches, high fuzzy TM matches, or repeated text. Texts with unambiguous vocabulary and simple sentence structure and grammar often yield understandable machine translation that lets readers grasp the general idea of the source text. Texts with terminology, long and complex sentence structures, and different punctuation, however, can be translated wrongly. The paper sets the sentence as the translation unit, so the translation result is not good enough; if the word were set as the translation unit, the translation quality might improve considerably.

In the paper, no pre-translation from professional translation can be found because the translation memory is not big enough. The paper selects the following comparatively good translations:

(1)The source text is “Philips is a registered trademark of Philips electronics.” and the machine translation result is “飞利浦是飞利浦电子公司的注册商标。” The machine translation is good in both wording and sentence structure, and readers can accept it easily.

(2)“Please contact your local Philips business contact for further information.” is translated as “请联系您当地的飞利浦进一步信息的业务联系。” Although the translation reads unnaturally, the meaning of the source text is not lost.

In fact, even these suggested good translations in Google Translator Toolkit are not good enough. The reasons are discussed below:

No perfect match or even fuzzy match can be found in the global shared translation memory or the uploaded translation memory, because the segment size is the sentence and the sample belongs to the genre of legal documents and regulations, in which the repetition rate is nearly zero. In addition, the translation memory is not big enough to contain relevant previous translations. Furthermore, legal documents and regulations involve many grammatically complex and extraordinarily long sentences and a slew of terminology, so Google Translator Toolkit is unsuitable for legal translation unless it is equipped with an adequate legal translation memory and glossary. Short sentences with few terms and unambiguous words can be translated properly by the toolkit, with acceptable and readable quality. However, if a sentence is too long and complex, the toolkit cannot translate it properly, and the output may not even be readable. For example:

(1)The source text is “In such case, such exclusions or limitations shall be limited to the greatest extent permitted by applicable law.” The machine translation is “这种情况下,这种排除或限制,应仅限于适用法律所允许的最大程度。” and the suggested translation is “在此情况下,此类例外或限制仅限于适用法律所要求的范围。” In this case, the machine translation is readable but hard to comprehend: it hardly makes sense in Chinese, and its meaning is far from that of the source text.

(2)The source text is “Philips is in no way responsible for the content of any site owned by a third party that may be linked to the web site via hyperlink, whether or not such hyperlink is provided by the web site or by a third party in accordance with the terms of use.” The machine translation is “飞利浦是绝不可能通过链接的网站链接到由第三方拥有的任何网站的内容负责,不论这种超链接网站或由第三方提供,按照条款使用。” and the suggested translation is “飞利浦对通过超链接连接到本网站的任何第三方所属站点的内容概不负责,无论此类超链接是由本网站还是由第三方根据使用条款提供。” In this example, the structure and grammar of the sentence are more complex than in example (1), and the machine translation is simply a word-for-word rendering that is neither acceptable nor readable in Chinese.

Therefore, the quality of machine translation is constrained by that of the source text. A text with proper grammar and unambiguous wording often leads to a reliable machine translation, whereas slang, misspelled or ambiguous words, and complex or lengthy sentences can easily cause the text to be translated incorrectly.
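The difference between a 100% match and a fuzzy match, discussed in this section, can be sketched with a string similarity score. This is only an illustration under assumptions: the character-level ratio from Python's `difflib` stands in for whatever scoring Google Translator Toolkit uses (the paper does not describe it), and `similarity`, `best_match` and the 0.75 threshold are hypothetical choices.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (a stand-in, not the toolkit's metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(
    sources: List[str], sentence: str, threshold: float = 0.75
) -> Optional[Tuple[str, float]]:
    """Return the closest stored source segment and its score.

    Returns None when the memory is empty or the best score is below
    the threshold, i.e. when there is no usable fuzzy match.
    """
    if not sources:
        return None
    scored = [(s, similarity(s, sentence)) for s in sources]
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


# Hypothetical memory containing one sentence quoted in this section.
stored = ["Philips is a registered trademark of Philips electronics."]

# A long legal sentence that shares little with the stored segment.
query = (
    "In such case, such exclusions or limitations shall be limited to the "
    "greatest extent permitted by applicable law."
)
print(best_match(stored, query))  # no match at this threshold
```

When every segment is a full legal sentence and sentences rarely repeat, most queries fall below any reasonable threshold, which is consistent with the absence of fuzzy matches reported above.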

5. Conclusion

Legal documents and regulations are considered suitable for machine translation because of their standard, formal grammar, unambiguous language style, and set words and terminology. However, legal translation requires the highest translation quality. The translation quality in this paper is unsatisfactory for three reasons: 1. an inadequate translation memory and glossary; 2. an inappropriate segment size in the translation memory; 3. the limitations of Google Translator Toolkit.

Therefore, to improve the CAT system, the segment size should first be reset from sentence to word, which can help machine translation become more accurate and precise, and the translation memory should be enlarged with more terminology and words to give the machine more references. Google Translator Toolkit cannot translate long and complex sentences properly; nevertheless, it does help improve translation efficiency, and its sharing system is quite useful and convenient for translators.
