Six Principles That Research on Formulaic Language Must Follow: Lessons from John Sinclair
2014-03-20 Xiao Fushou
(School of Foreign Languages, Shanghai University, Shanghai 200444)
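1. Introduction
Formulaic language has long been one of the central research topics in applied linguistics. Since the 1990s, this research has turned toward the corpus-driven study of fixed collocations (Sinclair, 1991; Biber et al., 1999, 2007, 2009; Cortes, 2004, 2006). By analyzing large quantities of spoken and written data with frequency-based methods, linguists have found that a great many linguistic units co-occur, and have examined the regularities of language in depth.
In China, corpus-based research has concentrated on comparisons between corpora, especially between native-speaker corpora and corpora of Chinese learners, in order to examine how Chinese university students acquire and use formulaic language (Pu Jianzhong, 2003; Ding Yanren & Qi Yan, 2005; Wang Lifei & Zhang Yan, 2006; Zheng Chao & Yuan Shihong, 2011). This research started relatively late, researchers differ in their understanding of corpus-driven formulaic language research, and their conclusions sometimes contradict one another.
In view of this, this paper analyzes in depth the contribution that John Sinclair (1991), the founding father of corpus linguistics, made to the study of formulaic language, and proposes six principles that corpus-driven research on formulaic language must follow. These principles offer guidance and a point of reference for corpus research in general and for formulaic language research in particular.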
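2. The principle of emphasizing fresh corpus data
Sinclair (1991) holds that, to discover the truth about how people actually use language, we must observe the language they actually use. The data used in linguistic research must therefore be authentic language, that is, large quantities of naturally occurring data, not data obtained through introspection and intuition. He writes (1991: 4): “...the contrast exposed between the impressions of language detail noted by people, and the evidence compiled objectively from texts is huge and systematic. It leads one to suppose that human intuition about language is highly specific, and not at all a good guide to what actually happens when the same people actually use the language.”
Sinclair’s view of corpus data suggests the following lessons:
(1) Texts should be selected according to the social role they play, not according to whether they can illustrate a particular language point. Regrettably, many grammarians and other linguists still select data in order to verify a particular linguistic phenomenon. In other words, when they find a phenomenon interesting, they choose the various usages surrounding it for analysis. This is inadvisable: if we concentrate only on what is unusual in English, we may overlook the more regular, humdrum patterns of the language.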
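(2) The corpus must be large. The larger the corpus, the more precisely it can describe frequently occurring lexical items. The more data there is, the more our understanding of core expressions changes: what once seemed important may turn out to be less so after it has been sifted through a corpus. A large corpus can reveal what is central and typical, and can distinguish the typical from the unusual and the typical from the merely possible. How large, then, should an ordinary corpus be? As a rule, it should contain at least one million words. According to Leech (1991), the earliest corpora contained about one million words, far more than linguists actually made use of. The corpus that Sinclair (1991) describes in Corpus, Concordance, Collocation contained over seven million words, and the 1997 Bank of English contained more than 300 million words. The larger the corpus, the easier it is to discover the regularities of language use.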
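(3) Observing large quantities of data makes it possible to analyze every aspect of language from many angles, and frequency is crucial among them. Without frequency information, language cannot be studied. Frequency studies show that some word strings co-occur surprisingly often, and that even so-called fixed expressions show a surprising degree of variability.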
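(4) The data must be systematically arranged. The more data there is, the more it needs to be organized. Without systematic organization, it is hard to establish the frequency of collocations. Software built around the word form as its unit can retrieve every instance of a given word form, display the words that occur to its left and right, and sort the resulting lines alphabetically so that patterns emerge. As Sinclair (1991: 4) puts it: “...the ability to examine large text corpora in a systematic manner allows access to a quality of evidence that has not been available before.”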
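(5) Adopt a “problem-oriented” approach. Tognini-Bonelli (2001) calls research that draws on corpus data to solve a problem in this way “corpus-based”, in contrast to “corpus-driven” research.
(6) Annotate the corpus so that software can retrieve a category (such as the passive voice, infinitive clauses, or complements) rather than a word form. For example, Biber et al. (1994) counted the frequency of clauses introduced by “that-” and “wh-”, Halliday (1993) used a large corpus to count the frequency of positive and negative clauses, and Kettermann (1997) used an annotated corpus to answer questions about language acquisition.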
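The word-form retrieval and alphabetical sorting described in points (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration, not the software Sinclair used: the tokenizer is deliberately crude, the function names `tokenize` and `kwic` are hypothetical, and the sample text is invented for the example rather than drawn from any corpus cited here.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into word forms (a crude tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def kwic(tokens, node, width=3):
    """Return (left context, node, right context) for each occurrence of `node`,
    sorted alphabetically by the right context."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return sorted(lines, key=lambda line: line[2])

# Invented sample text (an assumption for illustration only).
sample = ("She had a bath before dinner. He will have a walk after lunch. "
          "They have a meal together and then have a bath.")
tokens = tokenize(sample)
freq = Counter(tokens)              # frequency of every word form
concordance = kwic(tokens, "have")  # sorted concordance lines for "have"

for left, node, right in concordance:
    print(f"{left:>20} [{node}] {right}")
```

Sorting on the right context brings lines such as “have a bath”, “have a meal”, and “have a walk” next to one another, which is what makes a recurrent pattern visible to the analyst.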
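3. The principle of emphasizing the description of units of meaning
Sinclair (1991) holds that the meaning of some words changes when they occur in phrases. For example, in “have a baby” (give birth), “have a bath” (bathe), “have a cigarette” (smoke a cigarette), “have such conduct” (tolerate such conduct), “have a meal” (eat), “have a severe headache” (suffer from a bad headache), and “have a walk” (go for a walk), “have” is a very frequently used verb, but in these phrases it loses most of its original meaning; the meaning is no longer confined to the word but spreads over the whole phrase. This phenomenon is called “progressive delexicalization”.
Sinclair’s view of the description of units of meaning suggests the following lessons:
(1) In describing linguistic units, the restricted context must be taken fully into account;
(2) In studying formulaic language, attention must be paid to the changes of meaning that the same chunk undergoes in different contexts.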
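Take “naked eye” as an example. The British National Corpus contains 148 examples of “naked eye”. Analysis of these examples shows that the contexts in which “naked eye” usually occurs are not fixed, but they are restricted, as follows: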
Context 1: “naked eye” co-occurs with “the”, e.g.: We merely became accustomed to the general life of the common birds and animals, and to the appearances of trees and clouds and everything upon the surface that showed itself to the naked eye.
Context 2: “the naked eye” co-occurs with “to”, e.g.: The legs are flailing wildly - tiny stretches of insect flesh - no thicker than a hair to my naked eye, but obviously larger than life to this poor, wretched creature...
Context 3: “the naked eye” co-occurs with “with”, e.g.: The interesting point is that the Greeks were certainly able to see Merope with the naked eye, whereas today this is virtually impossible.
Context 4: “the naked eye” co-occurs with “by”, e.g.: It would have been no use asking him whether he thought there was a unifying purpose in life, whether it could really be chance that an animal so small that it couldn’t be seen by the naked eye could die millions of years ago in the depths of the sea and be resurrected by science to prove a man innocent or guilty.
Context 5: “the naked eye” co-occurs with “via”, e.g.: It is known more usually under the name Gill-maggot, because of the length and shape of the female’s egg-sacs which look like miniature white maggots when viewed via the naked eye.
Context 6: “the naked eye” co-occurs with “visible”, e.g.: The mite is just visible to the naked eye and feeds on honey bees and their grubs by sucking their body fluids.
Context 7: “the naked eye” co-occurs with “invisible”, e.g.: Through his telescope Galileo observed more things in the heavens than had ever been dreamed of: moons of Jupiter and myriads of stars invisible to the naked eye.
Context 8: “the naked eye” co-occurs with “obvious”, e.g.: The Small Cloud is very obvious with the naked eye, and binoculars show it well, though admittedly it cannot rival the splendour of the Large Cloud; it has no well-defined shape, but is easy to resolve, at least in part.
Context 9: “the naked eye” co-occurs with “separable”, e.g.: These pairs are separable with the naked eye, but closer binaries - or, of course, optical doubles - require binoculars or a telescope.
Context 10: “the naked eye” co-occurs with “make out”, e.g.: The body louse may lay its eggs in clothing or bedding, while the head louse, like the crab louse, cements its eggs on to hairs forming ‘nits’, which are the size of a pin-head and can just be made out with the naked eye.
Context 11: “the naked eye” co-occurs with “see”, e.g.: The whiskers were too small to see with the naked eye and nobody could possibly make a testing machine on that scale.
Context 12: “the naked eye” co-occurs with “split”, e.g.: I have never been confident that I can split them with the naked eye, but 7 × 50 binoculars make it easy enough.
Context 13: “the naked eye” co-occurs with “beat”, e.g.: Don’t forget, if ever you’re in doubt about what processor you’ve actually got, it’s hard to beat the naked eye!
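This paper finds that “naked eye” occurs in nearly 30 kinds of context in the BNC. These contexts show that “naked eye” occurs as a unit, but that its specific meaning is realized through the other words that combine with it. This unit is neither a syntactic unit nor a “fixed phrase”; it can co-occur with different words to form a “meaning unit”. If language is analyzed according to the idiom principle, the meaning unit will be the main unit of analysis.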
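The kind of context analysis reported above can be approximated by counting the words that occur just to the left of the phrase. The sketch below is only an illustration under stated assumptions: the function name `left_collocates` and the three-word span are hypothetical choices, and the input is a handful of fragments shortened from the BNC examples quoted in this section, not the full set of 148 lines.

```python
import re
from collections import Counter

def left_collocates(sentences, phrase, span=3):
    """Count the word forms found within `span` words to the left of `phrase`."""
    target = phrase.lower().split()
    n = len(target)
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                counts.update(tokens[max(0, i - span):i])
    return counts

# Fragments shortened from the examples quoted above (illustrative input only).
fragments = [
    "The mite is just visible to the naked eye",
    "myriads of stars invisible to the naked eye",
    "able to see Merope with the naked eye",
    "it couldn't be seen by the naked eye",
    "These pairs are separable with the naked eye",
]
counts = left_collocates(fragments, "naked eye")
print(counts.most_common(4))
```

On this small input the article “the” and the prepositions “to”, “with”, and “by” dominate the left-hand slots, while the lexical collocates (“visible”, “invisible”, “separable”) each occur once; a real study would run the same count over every concordance line.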
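4. The principle of emphasizing collocational frameworks
Sinclair (1991) and Sinclair & Renouf (1991) hold that phrases or units of meaning are not centered on words with lexical meaning; rather, they occur most often between grammatical words (such as “of”, “the”, “be”), which form “collocational frameworks” such as “too + ? + to”, “a/an + ? + of”, “be + ? + to”, “for + ? + of”, “had + ? + of”, and so on.
Sinclair and Renouf (1991) found that collocational frameworks account for a large share of the collocations in a corpus, and that a word (such as “series”) is important to a framework (such as “a + ? + of”) just as the framework is important to the word. For example, “a series of” is the seventh most frequent collocation in the “a + ? + of” framework. Of all the collocations of “series”, “a series of” makes up 57%, whereas among all the collocates “series” ranks only 17th in corpus frequency. Moreover, the words that occur in a framework are not chosen at random but fall into particular groupings or categories. For example, the nouns that occur in the “an + ? + of” framework can be grouped into the following classes (Sinclair & Renouf, 1991: 136-137):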
(1) measures and quantifiers (e.g. army, average, inch, ounce);
(2) parts of things (e.g. edge, end, evening, hour, part);
(3) an attribute (e.g. array, index);
(4) support for the noun after “of” (e.g. act, example, expression, inkling, object);
(5) an activity (e.g. extension, explanation, invasion, upsurge);
(6) a quality or state (e.g. absence, awareness);
(7) a relationship (e.g. enemy, officer).
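How a framework such as “a + ? + of” is searched for, and how the share of a word’s occurrences inside it is computed, can be sketched as follows. This is a minimal illustration rather than Sinclair and Renouf’s actual procedure: the function names are hypothetical and the sample text is invented, so the resulting figures have nothing to do with the 57% reported above.

```python
import re
from collections import Counter

def framework_fillers(tokens, left, right):
    """Count the words that fill the slot in the three-word frame `left + ? + right`."""
    fillers = Counter()
    for first, middle, last in zip(tokens, tokens[1:], tokens[2:]):
        if first == left and last == right:
            fillers[middle] += 1
    return fillers

def share_in_framework(tokens, word, left, right):
    """Proportion of all occurrences of `word` that fall inside the frame."""
    total = tokens.count(word)
    inside = framework_fillers(tokens, left, right)[word]
    return inside / total if total else 0.0

# Invented sample text (an assumption for illustration only).
text = ("A series of talks followed a number of delays. The series ended after "
        "a series of disputes, and a lot of people left.")
tokens = re.findall(r"[a-z]+", text.lower())

fillers = framework_fillers(tokens, "a", "of")
share = share_in_framework(tokens, "series", "a", "of")
print(fillers.most_common())
print(f"share of 'series' inside 'a + ? + of': {share:.0%}")
```

The two functions mirror the two directions of the argument: `framework_fillers` ranks the words that are important to the framework, and `share_in_framework` shows how important the framework is to a given word.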
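Sinclair’s (1991) view of collocational frameworks suggests the following lessons:
(1) The same phenomenon in a language can be described in many different ways. On the one hand, language users can arrive at descriptions from different angles; on the other, no single description is complete, and new perspectives must always be sought. Take Sinclair & Renouf’s (1991) description of “an examination of”. Although “examination” often occurs in the “an + ? + of” framework, this does not give us complete information about the word. In the 1997 Bank of English, “examination” occurs 7,327 times, but only 408 of these are in the “an + ? + of” framework; “examination of” occurs 2,031 times, of which about 400 co-occur with other determiners (e.g. his/its/the examination of) and another 400 or so have an adjective after the article of the framework (e.g. a detailed examination of). The core usage of “examination” is thus to be followed immediately by a prepositional phrase introduced by “of”, with a preceding determiner that is usually “a” or “an” but may also be some other modifier.
(2) Collocational frameworks show how often forms are actually used and offer a new perspective on language description, so that the traditional reliance on abstract linguistic categories can be avoided. In fact, what interests learners is less the “an + ? + of” framework than the full range of information about “examination”. In the British National Corpus, “an + examination + of” occurs 384 times, and these occurrences contain at least the following information: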
Information 1: “an examination of + N” as subject, e.g.: None the less an examination of the special reasons why the Plowden proposition is accepted as a truism in general but treated as an abominable heresy in particular may be worthwhile.
Information 2: “an examination of + N” in a prepositional phrase, e.g.: This in turn will lead us to an examination of how corporate law scholars have sought to offer new ways of legitimating corporate managerial power and how these too prove to be unequal to the task.
Information 3: “an examination of + N” as a coordinated element, e.g.: Bearing Philo’s words in mind - in particular his characterization of the male as active/causal and the female as passive - we shall now turn to the main focus of this essay, that is, an examination of the rituals of circumcision and menstrual taboo.
Information 4: “an examination of + N” as object, following a verb, e.g.: North Korea responded by emphasising the extent of its existing co-operation with the IAEA, and demanding an early inspection of South Korea, and in particular an examination of the nuclear capability of the US forces stationed south of the 38th parallel.
Information 5: “an examination of + N” as object, following a verb phrase, e.g.: The analysis begins with an examination of turn-length, turn-taking and topic-shift before applying pragmatic theories such as Grice’s cooperative Principle, Brown and Levinson’s Politeness Phenomenon and Leech’s Politeness Principle.
Information 6: “an examination of + N” as predicative complement, e.g.: Had I had the receiver in my hand when some break in the conversation occurred at this point, I should have explained to you that it is in fact neither; it is merely an examination of the various modes of thinking which the phrase implies - an examination which, in the tradition of British philosophical inquiry, seeks merely to study and perhaps oil the conceptual machinery and then to put it back more or less as it was.
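(3) A corpus can be processed by two different sets of procedures: one relies entirely on co-occurrence frequency, with the help of computer software; the other is more interpretative and requires input from the researcher. In this way we can avoid the drawbacks of the traditional method, in which the corpus is searched by hand and the usages of each lexical item are picked out one by one. That method is time-consuming, cannot cope with large corpora, and leads to biased observation.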
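5. The principle of emphasizing the interplay of meaning and structure
Sinclair (1991) holds that the sense and the structure of a collocation are interrelated. “Structure” here means a word together with its associated patterns and collocations. Some structures are lexical, such as “stand a chance” (have a good chance), “stand the test of time” (withstand the test of time), “stand treat” (pay for the treat), “stand comparison with sb.” (bear comparison with someone), “stand sb. a meal” (treat someone to a meal), “stand idle” (lie unused), and “stand ready” (be ready for use); others are grammatical, such as “stand behind” (be positioned behind; support), “stand by” (look on without acting; support), “stand clear of” (keep away from), “stand for” (represent; advocate), “stand in” (substitute), “stand out” (be prominent; be outstanding), “stand up for” (defend; support), and “stand up to” (face bravely; withstand). Sinclair (1991: 65) explains: “It seems that there is a strong tendency for sense and syntax to be associated.”
Sinclair’s view of form and meaning suggests the following lessons:
(1) Observing the “structure” of a word helps to distinguish the senses of a polysemous word. If a word has several senses, it will be used in several patterns, and a given sense will be far more frequent in one pattern than in others. The pattern found in an example therefore indicates the most likely sense of the word. The senses of “yield”, for example, appear in the following patterns: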
Pattern 1: YIELD + N + Prep + N, e.g.: As 60 per cent of the cassava grown in this area is marketed in towns, a yield increase even of this order of magnitude has had a positive impact on urban food supplies.
Pattern 2: YIELD (verb) + N, e.g.: Conventional radiocarbon dating normally requires sample sizes which will yield a minimum of 1g of carbon.
Pattern 3: V + YIELD, e.g.: Early herbicide treatments provide the best control, because older grass weed seedlings are more difficult to kill and also compromise yield.
Pattern 4: N + YIELD, e.g.: Kloof, benefiting from a higher gold yield, lifted net profits R4.77m to R97.7m, despite production losses over the December-January holiday period and a fall in the tonnage of ore milled.
Pattern 5: YIELD (noun) + N, e.g.: But since their dividend growth should be ahead of the market’s, the yield premium should be slimmer.
Pattern 6: YIELD used on its own, e.g.: This leads to a violent tussle between them with Betty refusing to yield.
(The examples above are taken from the British National Corpus.)
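(2) The relation between meaning and structure is not one-to-one. A sense of a word is not restricted to a single pattern, and a pattern is not restricted to a single sense of a word. If the relation were one-to-one, ambiguity could not arise. In fact ambiguity is possible, and many jokes depend on it. In ordinary communication, however, ambiguity is rare, because structure is enough to distinguish the senses.
Take “stand”. Out of context, or as a joke, “Is this your stand?” is ambiguous and has at least the following six readings:
(a) Is this your view?
(b) Is this your position?
(c) Is this your attitude?
(d) Is this your stall?
(e) Is this your rack?
(f) Is this your performance poster?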
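In normal communication, however, such ambiguity seldom arises. The meaning of “stand” in each of the following sentences, for example, is obvious:
(1) I can’t stand the sight of her. (I cannot bear to look at her.)
(2) Business is at a stand. (Business is slack.)
(3) A roar of applause erupted from the stands. (The spectators’ seating burst into cheers.)
(4) The witness was put on the stand. (The witness was called to the witness box.)
(5) His hat was put on the hat stand. (His hat was placed on the hat rack.)
(6) He always stands first in his class. (He always ranks top of his class.)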
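6. The principle of emphasizing the interplay of the “idiom principle” and the “open-choice principle”
Sinclair (1991) holds that meaning is built out of a large number of chunks, and that these chunks are usually predictable. On this basis he (1991: 110) proposed the “idiom principle”: “The principle of idiom is that a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments.”
The study of fixed phrases has a long tradition, but phrases have often been excluded from the normal organizing principles of language. Sinclair broadened the concept and scope of phraseology. In his view, to some extent all the meanings of a word reside in frequently occurring sequences of morphemes and can be identified through those sequences. If these semi-preconstructed phrases are the general rule in language rather than the exception, they can be incorporated, as the idiom principle, into the normal organizing principles of language.
However, the idiom principle is not sufficient to account for every instance of language use. Sinclair (1991: 109-110) therefore also proposed the “open-choice principle”, which he explains as follows: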
This is a way of seeing language as the result of a very large number of complex choices. At each point where a unit is completed (a word or a phrase or a clause), a large range of choice opens up and the only restraint is grammaticalness... Virtually all grammars are constructed on the open-choice principle.
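Sinclair’s (1991) “two principles” view suggests the following lessons:
(1) Both principles can serve as ways of observing and interpreting language. In using language, the language user must decide whether to interpret a stretch of text as a chunk or as a sequence of individual lexical items. Take “I must confess”. Under the idiom principle, “I must confess” functions as a single lexical item in which no word can be replaced by another. It can be interpreted as “I am about to tell you something that you will find unpleasant or embarrassing”, as in “I must confess I lost your car key”. If “I must confess” is changed to “he must confess” or “I must not confess”, the meaning changes. Under the open-choice principle, the words in “I must confess” can be replaced by others, as in “he must confess”, “I must not confess”, and “I must run away”. The chunk can then be interpreted as “I am obliged to admit that I did something wrong” (e.g. I drove through a red light that night, I must confess).
(2) In actual communication, a phrase is first interpreted by the idiom principle and only then by the open-choice principle. Although either principle is always available for interpreting the same phrase, the two cannot be applied simultaneously in the same context; one normally takes precedence, and the idiom principle comes first, as Sinclair (1991: 114) puts it: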
For normal texts, we can put forward the proposal that the first mode to be applied is the idiom principle, since most of the text will be interpretable by this principle. Whenever there is good reason, the interpretative process switches to the open-choice principle, and quickly back again. Lexical choices which are unexpected in their environment will presumably occasion a switch; choices which, if grammatically interpreted, would be unusual are an affirmation of the operation of the idiom principle.
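7. The principle of emphasizing the interplay of lexis and grammar
The traditional approach to language description treats lexis and grammar separately. This approach is not without foundation, because some combinations of words in a clause are plainly meaningless yet grammatically correct, as in Chomsky’s famous nonsensical sentence “Colorless green ideas sleep furiously”.
Sinclair (1991) holds that it is a mistake to separate lexis from grammar; the two should form a unity, which is the concept Halliday (1993) refers to as “lexicogrammar”. This concept indicates that lexis and grammar are inseparable in real life. Sinclair (1991) advances this view on the basis of corpus evidence, arguing that lexis and grammar are separate only under the open-choice principle. If the distinction between them is treated as the core organizing feature of language, all idiomatic and collocational expressions become anomalous variants, and such a description is plainly wrong. Sinclair (1991: 103-4) elaborates: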
The decoupling of lexis and syntax leads to the creation of a rubbish dump that is called “idiom”, “phraseology”, “collocation”, and the like. If two systems are held to vary independently of each other, then any instances of one constraining the other will be consigned to a limbo for odd features, occasional observations, usage notes, etc. But if evidence accumulates to suggest that a substantial proportion of the language description is of this mixed nature, then the original decoupling must be called into question. The evidence now becoming available casts grave doubts on the wisdom of postulating separate domains of lexis and syntax.
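Sinclair’s (1991) “lexicogrammar” view suggests the following lessons:
(1) In formulaic language research, we must examine not only the form of a chunk but also, and more importantly, the meaning expressed by the content words it contains, for example the meanings expressed by “peel” and “pineapple” in “peel a pineapple”. Likewise, chunks containing “gather” include “gather courage” (pluck up courage), “gather crops” (harvest crops), “gather dust” (be shelved), “gather experience” (gradually acquire experience), “gather flowers” (pick flowers), “gather information” (collect intelligence), “gather oneself” (pull oneself together), “gather roses” (seek pleasure), “gather speed” (gradually pick up speed), “gather strength” (rally one’s strength), and so on.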
(2) In formulaic language research, we must not only treat a chunk as a whole but also attend to the cognitive mechanisms it implies. Take “invest a lot of time in sb”, which implies several conceptual metaphors, including:
(a) TIME IS MONEY, realized in chunks such as “spend one’s time”, “budget one’s time”, and “cost a lot of time”;
(b) TIME IS A LIMITED RESOURCE, realized in chunks such as “have much time left”, “live on borrowed time”, “have enough time to spare”, “put aside some time”, and “run out of time”;
(c) TIME IS A VALUABLE COMMODITY, realized in chunks such as “waste one’s time”, “save much time”, and “lose a lot of time”.
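8. Conclusion
To sum up, by analyzing Sinclair’s contribution to corpus linguistics, this paper has proposed six principles that corpus-driven research on formulaic language must follow: (1) emphasizing fresh corpus data; (2) emphasizing the description of units of meaning; (3) emphasizing collocational frameworks; (4) emphasizing the interplay of meaning and structure; (5) emphasizing the interplay of the “idiom principle” and the “open-choice principle”; (6) emphasizing lexis and grammar as a unity.
These principles show that researchers need not only to examine carefully the chunk features of individual lexical items, but also to use computers to retrieve frequently occurring word sequences. They offer guidance and a point of reference for broadening and deepening the teaching and study of formulaic language in China.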
[1] Biber, D., Conrad, S. & R. Reppen. Corpus-based Approaches to Issues in Applied Linguistics [J]. Applied Linguistics, 1994(15): 169-189.
[2] Biber, D. & F. Barbieri. Lexical Bundles in University Spoken and Written Registers [J]. English for Specific Purposes, 2007(26): 263-286.
[3] Biber, D. A Corpus-driven Approach to Formulaic Language in English [J]. International Journal of Corpus Linguistics, 2009(14): 275-311.
[4] Biber, D., Johansson, S., Leech, G., Conrad, S. & E. Finegan. Longman Grammar of Spoken and Written English [M]. Harlow: Longman, 1999.
[5] Cortes, V. Lexical Bundles in Published and Student Disciplinary Writing: Examples from History and Biology [J]. English for Specific Purposes, 2004(23): 397-423.
[6] Cortes, V. Teaching Lexical Bundles in the Disciplines: An Example from a Writing Intensive History Class [J]. Linguistics and Education, 2006(17): 391-406.
[7] Halliday, M. A. K. Quantitative Studies and Probabilities in Grammar [M]// M. Hoey. Data, Description, Discourse: Papers on the English Language in Honour of John McH. Sinclair. London: HarperCollins, 1993: 1-25.
[8] Kettermann, B. Using a Corpus to Evaluate Theories of Child Language Acquisition [M]// Wichmann A. et al. Teaching and Language Corpora. London: Longman, 1997: 186-194.
[9] Leech, G. The State of the Art in Corpus Linguistics [M]// K. Aijmer & B. Altenberg. English Corpus Linguistics: Studies in Honour of Jan Svartvik. London: Longman, 1991: 8-29.
[10] Sinclair, J. Corpus, Concordance, Collocation [M]. Oxford: OUP, 1991.
[11] Sinclair, J. M. & A. Renouf. Collocational Frameworks in English [M]// K. Aijmer & B. Altenberg. English Corpus Linguistics: Studies in Honour of Jan Svartvik. London: Longman, 1991: 128-144.
[12] Tognini-Bonelli, E. Corpus Linguistics at Work [M]. Amsterdam: John Benjamins, 2001.
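[13] Ding Yanren, Qi Yan. A Study of the Correlation between Chunk Use and English Speaking and Writing Proficiency [J]. Journal of PLA University of Foreign Languages, 2005(3): 28-30.
[14] Pu Jianzhong. Colligation, Collocation, and Chunks in English Vocabulary Teaching [J]. Foreign Language Teaching and Research, 2003(6): 438-445.
[15] Wang Lifei, Zhang Yan. A Corpus-based Study of Chunk Use Patterns in University Students’ English Argumentative Writing [J]. Computer-Assisted Foreign Language Education, 2006(4): 36-41.
[16] Zheng Chao, Yuan Shihong. The Acquisition of “...and...” Chunks by Chinese Learners from the Perspective of Chunk Types [J]. Foreign Language Teaching and Research, 2011(1): 109-117.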