APP下载

碎纸片拼接复原的算法研究

2015-04-26崔艳星

长治学院学报 2015年2期
关键词:复原纸片纸条

郭 伟,崔艳星

(长治学院 数学系,山西 长治 046011)

1 引言

破碎文件的拼接在司法物证复原、历史文献修复以及军事情报获取等领域都有着重要的应用。传统上,拼接复原工作需由人工完成,准确性较高,但效率较低。当碎片数量巨大,人工拼接很难在短时间内完成任务。随着计算机技术的发展,人们试图开发研究碎纸片的自动拼接技术,以提高拼接复原效率。现对以下问题进行讨论:对给定的来自同一页印刷文字文件的碎纸机破碎纸片(仅纵切),建立碎纸片拼接复原模型和算法,并对附件1给出的一页文件的碎片数据进行拼接复原。复原结果表达以图片形式表达。

2 复原算法思想描述

对于给定的来自同一页印刷文字文件的碎纸机破碎纸片(仅纵切);首先,通过观察这些字的特点,可以通过人工干预[1]将纸条左边全是整字、符号的碎纸条找出来,这就找出了这些纸条中的第一条。同样,我们可以确定出最后一条碎片。因为每条碎纸片中离边缘最近的那些字,有的是整字,有的是半字,有的是符号,有的是空格,所以对碎片的拼接有一定的难度.为了使问题简单化,我们联想到把文字转化为数字,其中这些数字用“0”或者“1”来代替;半个字用0表示,整个字、符号、空格都用1表示。

其次,我们把纸条中最左边转化成的数字按所在行形成左向量,同样最右边的那些文字形成一个右向量。这样每个碎纸条都可以用两个向量来表示。进而我们就把碎纸片文字的拼接问题转化为向量是否相等的问题了。如果一个纸条的左向量和另一个纸条的右向量相等,说明这两条纸条可拼接到一起,这个过程可利用c#编程完成排序[2]。

最后,根据上述排序结果,利用MATLAB程序可以检验c#编程运行结果并实现纸片的复原[3,4]。

3 复原算法的实现

首先,从对于给定的来自同一页印刷文字文件的碎纸机破碎纸片(仅纵切)的向量中找出初始向量。因为破碎前纸片的最左边的字都是整字,我们可以通过人工干预,很容易把都是整字的碎纸条找出来,即左向量是[0,0,…,0,0]的纸条找出来。这样我们就找出了初始向量。假如找到的初始向量为Li,则初始向量所在纸条的右向量为。

其次,寻找与Ri相等的向量Lj(其中i 不等于j)。假如Ri-Lj=0,即Ri与Lj相等,则第i-1条碎片与第j-1条碎片拼接成功.成功后再用Rj寻找下一条。假如Rj-Lk≠0,则说明Rj与Lk不相等,则第j-1条碎片与第k-1条碎片拼接不成功,不成功的话再用Rj-Lk是否等于0来寻找下一条,{其中,k≠j};假如Ri-Lj≠0则假如有Ri-Lh=0,则拼接成功.假如Ri-Lh≠0,则依次类推。

最后,由Rm-Ln=0.则碎纸片排序完成。对于上述过程。我们建立了如下的流程图:

例如:下图是给定碎片,共19支(从左至右编号000~018),现将其复原。

首先,找到其中左侧为整字的碎片,向量表示结果见附录1,上图中第一条碎片是第8条碎片,最后一条碎片是第6条碎片,所以在用该流程时,i=9,n=7。然后,利用MATLAB程序,检验结果并将碎纸片拼接复原。具体程序见附录2.复原图片结果分别见附录3。

4 附录

附录1

L1=[1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1]

R1=[0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,0,0,0,0,1]

L2=[1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,1,0,0]

R2=[0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0]

L3=[0,0,0,0,0,0,1,1,0,0,1,0,1,1,0,0,1,0,1,0,0,0,1,0,1,0,0,1]

R3=[0,1,0,0,0,0,0,0,1,1,0,1,0,1,0,1,0,1,0,0,0,1,0,1,0,0,0,0]

L4=[0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0]

R4=[0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0]

L5=[0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0]

R5=[0,0,0,0,1,1,0,1,0,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,0,0,0,0]

L6=[0,0,0,0,1,1,0,1,0,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,0,0,0,0]

R6=[0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,1,1,0,0]

L7=[0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,0,0,0,0,1]

R7=[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]

L8=[1,1,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0]

R8=[0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,1]

L9=[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]

R9=[1,0,1,0,1,1,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0]

L10=[0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,1,1,0,0]

R10=[0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0]

L11=[0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0]

R11=[0,0,0,0,0,0,1,1,0,0,1,0,1,1,0,0,1,0,1,0,0,0,1,0,1,0,0,1]

L12=[0,0,1,1,1,1,1,0,1,1,0,0,0,0,1,1,1,1,0,0,0,0,0,1,0,0,1,0]

R12=[1,1,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0]

L13=[1,0,0,1,0,0,0,1,0,1,0,1,1,1,0,1,1,1,1,1,1,1,1,0,1,0,0,1]

R13=[0,0,1,0,1,1,1,0,1,0,1,1,0,1,1,0,0,1,0,1,1,1,0,1,1,1,1,1]

L14=[0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0]

R14=[1,0,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0]

L15=[1,0,1,0,1,1,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0]

R15=[1,0,0,1,0,0,0,1,0,1,0,1,1,1,0,1,1,1,1,1,1,1,1,0,1,0,0,1]

L16=[0,0,1,0,1,1,1,0,1,0,1,1,0,1,1,0,0,1,0,1,1,1,0,1,1,1,1,1]

R16=[0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0]

L17=[0,1,0,0,0,0,0,0,1,1,0,1,0,1,0,1,0,1,0,0,0,1,0,1,0,0,0,0]

R17=[1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,1,0,0]

L18=[0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,1]

R18=[1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1]

L19=[1,0,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0]

R19=[0,0,1,1,1,1,1,0,1,1,0,0,0,0,1,1,1,1,0,0,0,0,0,1,0,0,1,0]

附录2

I1=imread ('008.bmp');

I2=imread ('014.bmp');

I3=imread ('012.bmp');

I4=imread ('015.bmp');

I5=imread ('003.bmp');

I6=imread ('010.bmp');

I7=imread ('002.bmp');

I8=imread ('016.bmp');

I9=imread ('001.bmp');

I10=imread ('004.bmp');

I11=imread ('005.bmp');

I12=imread ('009.bmp');

I13=imread ('013.bmp');

I14=imread ('018.bmp');

I15=imread ('011.bmp');

I16=imread ('007.bmp');

I17=imread ('017.bmp');

I18=imread ('000.bmp');

I19=imread ('006.bmp');

I=[I1,I2,I3,I4,I5,I6,I7,I8,I9,I10,I11,I12,I13,I14,I15,I16,I17,I18,I19];imshow(I)

附录3

[1]韩中庚,数学建模方法及其应用,第二版[M].北京:高等教育出版社,2009.

[2]王聚丰,杨启帆.数学建模案例集[M].北京:高等教育出版社,2003.

[3]司守奎,孙玺菁.数学建模算法与应用[M].北京:国防工业出版社,2011.

[4]贾海燕,碎纸自动拼接,http://www.docin.com,2013年9月15日[DB/OL].

猜你喜欢

复原纸片纸条
温陈华:唐宋甲胄复原第一人
两张纸条儿(上)
纸条大侦探
浅谈曜变建盏的复原工艺
听话的纸片
毓庆宫惇本殿明间原状陈列的复原
五张纸条
纸片也能托住水
讨厌体假日
神秘的纸条