Artificial Intelligence Has a Problem with Grammar
人工智能遭遇语法问题
2022-01-18
莱恩·格林 译/傅颖 Lane Greene
The hitch illuminates the nature of language.
这一难题揭示了语言的本质。
If you frequently Google language-related questions, whether out of interest or need, you’ve probably seen an advertisement for Grammarly, an automated grammar-checker. In ubiquitous YouTube spots, Grammarly touts its ability not only to fix mistakes, but to improve style and polish too. Over more than a decade it has sprawled into many applications: it can check emails, phone messages or longer texts composed in Microsoft Word and Google Docs, among other formats.
Does it achieve what it purports to? Sometimes. But sometimes Grammarly doesn’t do what it should, and sometimes it even does what it shouldn’t. These strengths and failings hint at the essence of language and the peculiarity of human intelligence, as opposed to the artificial sort as it stands today.
Begin with the strengths. In a rough piece of student writing, Johnson, The Economist’s language column, counted 14 errors. Grammarly flagged five. For example, it sensibly suggested inserting a hyphen in “post cold war [world]”. It spotted a missing “the” in the phrase “with [the] European economy”. And it noticed an absent “about” in “wondering [about] the state of Europe”. By using Grammarly, the author of this essay could have avoided some red ink.
On the other hand, Grammarly has a problem with false positives: it flags mistakes that are not mistakes. Its other two suggestions were not disastrous, but neither did they relate to “critical errors”, as Grammarly maintains. In the assertion that enlargement had “created a fatigue” within the European Union, Grammarly needlessly suggested deleting the “a”. In another error-ridden sentence it recommended removing a comma, which fixed none of the problems. This false-positive tendency is not a deal-breaker for reasonably skilled writers who just want a second pair of eyes; you can dismiss any suggestion you like. But truly struggling scribblers might not know when Grammarly’s ideas would make their prose worse rather than better.
Then there are the false negatives: the mistakes Grammarly fails to notice. Depending on the text, Grammarly can seem to miss more errors than it marks. The company’s chief executive, Brad Hoover, describes it as a “coach, not a crutch”, which sets expectations more appropriately than some of the ads do.
Artificial-intelligence systems like Grammarly are trained with data; for instance, translation software is fed sentences translated by humans. Grammarly’s training data involve a large number of standard error-free sentences (so it knows what good English should look like) and human-corrected sentences (so the software can find the patterns of fixes that human editors might make). Developers also manually add certain rules to the patterns Grammarly has taught itself. The software then looks at a user’s prose: if a string of words seems ungrammatical, it tries to spot how the putative mistake most closely resembles one from its training inputs.
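The idea of learning fixes from human-corrected sentences can be illustrated with a toy sketch. It is not Grammarly’s system, whose internals are not described here; the function names `learn_fixes` and `suggest` are hypothetical, and the approach (counting word-level edits together with one word of context on each side) is an assumption made purely for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def learn_fixes(pairs):
    """Count the word-level edits editors made in (draft, corrected) pairs."""
    fixes = Counter()
    for draft, corrected in pairs:
        a, b = draft.split(), corrected.split()
        for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
            if tag == "equal":
                continue
            # Keep one word of context on each side so the edit has an anchor.
            left = a[i1 - 1] if i1 > 0 else "<s>"
            right = a[i2] if i2 < len(a) else "</s>"
            fixes[(left, tuple(a[i1:i2]), right, tuple(b[j1:j2]))] += 1
    return fixes


def suggest(sentence, fixes):
    """List (left, old, new, right) edits whose learned context matches."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    found = []
    for left, old, right, new in fixes:
        n = len(old)
        for i in range(len(words) - n - 1):
            if (words[i] == left
                    and tuple(words[i + 1:i + 1 + n]) == old
                    and words[i + 1 + n] == right):
                found.append((left, " ".join(old), " ".join(new), right))
    return found


# Hypothetical training pair, borrowed from the essay discussed above.
fixes = learn_fixes([("with European economy", "with the European economy")])
print(suggest("trade with European partners", fixes))
# → [('with', '', 'the', 'European')]
```

The sketch proposes inserting “the” into “trade with European partners”, a phrase that is fine as it stands. Matching surface patterns without grasping what the writer meant is one way false positives of the kind described above can arise.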
All this shows how far artificial “intelligence” is from the human kind (which Grammarly wants to correct to “humankind”). Computers outpace humans at problems that can be cracked with pure maths, such as chess. Advances in language technology have been impressive in, for example, speech recognition, which involves another sort of statistical guess—whether or not a stretch of sound matches a certain string of words. One Grammarly feature that works fairly well is sentiment analysis. It can rate the tone of an email before you send it, after being trained on texts that have been assessed by humans, for example as “admiring” or “confident”.
But grammar is the real magic of language, binding words into structures, binding those structures into sentences, and doing so in a way that maps onto meaning. And at this crucial structure-meaning interface, machines are no match for humans. Computers can parse (grammatical) sentences fairly well, labelling things like nouns and verb phrases. But they struggle with sentences that are difficult to analyse, precisely because they are ungrammatical—in other words, written by the kind of person who needs Grammarly.
To correct such prose requires knowing what the writer intended. But computers don’t work in meaning or intention; they work in formulae. Humans, by contrast, can usually understand even rather mangled syntax, because of the ability to guess the contents of other minds. Grammar-checking computers illustrate not how bad humans are with language, but just how good.
如果你经常上谷歌搜索与语言相关的问题,无论是出于兴趣还是出于需要,你都可能看到过Grammarly的广告,这是一款自动语法检查工具。在漫天的优兔插播广告中,Grammarly宣称它不仅能够纠正错误,还能改进文风,给文章润色。10多年来,它已经打入许多应用程序:它能够检查电子邮件、手机短信,或是以微软Word文档、谷歌文档等其他格式编写的长文本。
那它说到做到了吗?有时候做到了。但有时候Grammarly失职了,有时候它甚至做了不该做的。这些优缺点暗示出语言的本质以及人类智能的特性,而非当今所谓人工智能的特点。
先说Grammarly的优点。在一篇质量不高的学生作文中,《经济学人》的约翰逊语言专栏标出了14处错误。Grammarly则标记了5处。例如,它建议在词组“post cold war [world](后冷战[世界])”中插入连字符,这很合理;它发现,短语“with [the] European economy(欧洲经济)”漏了the;它还注意到,“wondering [about] the state of Europe(对欧洲状况的思考)”少了about。借助Grammarly,这篇文章的作者可以避免一些错误。
而另一方面,Grammarly存在误报问题,它会指出并非错误的错误。Grammarly给出的另外两条建议虽不至于离谱,但也谈不上它所认为的“严重错误”。针对欧盟扩大在内部“created a fatigue(引发了疲劳)”这句话,Grammarly建议删除a,这多此一举。另一个满是错误的句子则被建议删除逗号,可这并未解决任何问题。对那些只想多一双眼睛检查的写作高手来说,这种频现的误报并不会坏事:你可以忽略想忽略的任何建议。但那些绞尽脑汁、水平不高的作者可能无法判断,在什么情况下Grammarly的建议会帮倒忙。
此外,Grammarly还存在漏报问题,即无法发现某些错误。Grammarly漏掉的错误可能比标记出来的还要多,视文本内容而定。该公司首席执行官布拉德·胡佛将Grammarly形容为“教练,而非拐杖”。相较一些广告,这个比方更为恰当地设定了此款软件该符合的期望。
像Grammarly这样的人工智能系统是用数据训练的。例如,翻译软件的训练数据是人工翻译的句子。Grammarly的训练数据包括大量标准无误的句子(所以它知道好的英语应该是什么样子)和人工纠正的句子(所以它能发觉人工编辑可能采取的改错模式)。开发人员还将某些规则手动添加到Grammarly的自学修改模式中。这样,当该软件检查用户文章时,如果一串单词看起来不合语法,它便会试图找出假定的错误与训练输入的错误最相似的地方。
所有这些表明,人工“智能”和人的智能[即human kind,Grammarly会把这个词组改为“humankind(人类)”]相去甚远。计算机在下国际象棋等纯数学问题上比人厉害。它在语言技术方面的进步也令人赞叹,比如语音识别,这涉及另一种统计猜测,即一段声音与某串单词是否匹配。Grammarly具备一项很棒的功能:情绪分析。它可以在电子邮件发送之前对其语气进行评估。它接受过训练,见识过哪些文本被人类评定为“赞赏的”或“自信的”等等。
然而,语言真正的神奇之处在于语法,它将单词绑定到结构中,将这些结构绑定到句子中,使之表情达意。结构与意义之间的交互至关重要,在这点上,机器无法与人类相比。尽管计算机能很好地(从语法上)解析句子,标出诸如名词和动词短语等句子成分,但面对难以分析的句子,计算机束手无策,这恰恰是因为这些句子不符合语法,换句话说,写出这些句子的正是需要Grammarly的人。
要修改这类文本,就要知道作者的意图。但是,计算机无法理解意义或意图,它们靠的是公式。相比之下,人类因为有能力猜测别人的想法,所以通常能够理解十分混乱的句法。用计算机检查语法,并不能说明人类处理语言的能力有多么糟糕,相反,这只能说明人类的语言能力十分出色。
(译者为“《英语世界》杯”翻译大赛获奖者)