
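From Drums and Horns to Ten Thousand "Codes" at Full Gallop
Encoding and the Standardization of Chinese Character Information Transmission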

2014-02-04 五花肉

质量与标准化 (Quality and Standardization), 2014, Issue 6
Keywords: Chinese language, encoding, speech

By 五花肉


Each officer keeps six flags, each with a staff 8.34 meters long and a banner 5 meters long. When the enemy reaches the bank of the moat, the defenders beat the drum three times and raise one flag. When the enemy has climbed halfway up the rampart, the defenders beat the drum without pause. At night, torches replace the flags, in equal number. If the enemy retreats, the defenders raise the same number of flags but do not beat the drum.

From 《墨子》 (Mozi), Book 15

One stands for a horizontal stroke; two and three stand for vertical strokes; four and five stand for left-falling strokes; six stands for a dot or a right-falling stroke; seven stands for a cross; and eight and nine stand for left and right hooks.

From Hu Shi (胡适), preface to 《四角号码检字法》 (The Four-Corner Number Lookup Method)

From oracle-bone script, bronze inscriptions, seal script and clerical script to regular script, and on to the computer-encoded characters of the information age, Chinese characters have emerged and flourished alongside Chinese civilization. Beyond traditional means of recording and dissemination such as paper, handwriting and printing, the Chinese people have also, by advancing the standardization of character forms, developed a way of transmitting Chinese character information in which text is the content and encoding is the carrier.

Beacon smoke and banners, drums and horns: these phrases usually stand for battlefield warfare. For thousands of years they were the army's common means of exchanging intelligence and relaying orders, and they were also the earliest germ of using encoding to transmit information. Although ancient Chinese military strategists set standards for using these signals, the information they carried could never travel beyond the range of human sight and hearing.

It was not until 1925, after telegraph code had been introduced and put to use in modern China, that Wang Yunwu (王云五) of Shanghai built on it to develop the four-corner code (四角号码), which supports character lookup. The most primitive Chinese character encoding was born. Although this code produces too many duplicate codes (several characters sharing one code) to serve as a computer input code, the insight it offered was epoch-making: by combining certain features of a character with ordered symbols, characters can be placed in a definite order and handled systematically. This formed the embryo of Chinese character information processing. The telegraph code and the four-corner code became the two major achievements of the time in digitizing and standardizing the characters used in Chinese society.
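The duplicate-code problem (重码) described above can be made concrete with a small sketch. The Python below groups entries by their code and reports every code shared by more than one entry. The table `toy_codes`, its labels and its four-digit values are hypothetical placeholders invented for illustration; they are not real four-corner codes.

```python
from collections import defaultdict


def find_duplicate_codes(code_table):
    """Group entries by code and return only the codes shared by 2+ entries."""
    by_code = defaultdict(list)
    for entry, code in code_table.items():
        by_code[code].append(entry)
    return {code: entries for code, entries in by_code.items() if len(entries) > 1}


# Hypothetical placeholder data: labels and codes are invented for illustration
# and are NOT real four-corner codes.
toy_codes = {
    "char_A": "1234",
    "char_B": "1234",
    "char_C": "5678",
}

print(find_duplicate_codes(toy_codes))
```

In a lookup index, a reader can resolve such a collision by scanning a short list of candidates. An input code would presumably need that choice made each time a colliding code is typed, which is one way to read the limitation noted above.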

In the 1980s, with the release of Code of Chinese Graphic Character Set for Information Interchange, Primary Set (《信息交换用汉字编码字符集 基本集》, GB 2312-80), the Chinese language entered the information age. In just 30 years, China produced over a thousand Chinese character encoding schemes and dozens of input methods, a scene of ten thousand "codes" at full gallop. In recent years, unified under standardization, the mainstream input methods have settled into phonetic codes, shape-based codes, and handwriting or speech input. "Chinese character information processing and the printing revolution" has been ranked as a major 20th-century engineering achievement of China, second only to "Two Bombs, One Satellite".
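As a minimal sketch of what a coded character set for information interchange does, the Python below uses the standard library's `gb2312` codec to turn a character into its two-byte form and then into its row and cell position in the GB 2312 table. The helper name `gb2312_row_cell` is an assumption made for this example. The subtraction of 0xA0 assumes the EUC-CN byte form that Python's codec emits.

```python
def gb2312_row_cell(ch):
    """Return the (row, cell) position of one character in the GB 2312 table."""
    # Python's "gb2312" codec emits the EUC-CN byte form; a character outside
    # the set raises UnicodeEncodeError here.
    raw = ch.encode("gb2312")
    if len(raw) != 2:
        raise ValueError("not a double-byte GB 2312 character")
    # In EUC-CN each byte is the row or cell number plus 0xA0.
    return raw[0] - 0xA0, raw[1] - 0xA0


for ch in "汉字":
    row, cell = gb2312_row_cell(ch)
    # Print the character, its two bytes in hex, and its four-digit row-cell code.
    print(ch, ch.encode("gb2312").hex(), f"{row:02d}{cell:02d}")
```

Every character in the set has exactly one position, so two systems that agree on the table can exchange text as plain byte pairs.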

Today, with the dawn of the big-data era and breakthroughs in speech recognition, Chinese character information processing has reached another peak. Chinese speech recognition is widely used on smartphone platforms such as iOS and Android. Chinese domain names are increasingly common, and the standing of Chinese characters and Chinese language culture in the "global village" rises steadily. In the future, with the rejuvenation of the Chinese nation, Chinese characters will surely let Chinese civilization shine even more brightly in the information society!

(Supporting organization: 上海市质量和标准化研究院, Shanghai Institute of Quality and Standardization)
