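ZHANG Yan 1, ZHENG Xiao-Dong 1, LI Jin-Song 1, LU Jiao-Tong 2, CAO Cheng-Yin 1, SUI Jing-Kun 1
1 Research Institute of Petroleum Exploration and Development, PetroChina, Beijing 100083; 2 Sinopec Petroleum Engineering Geophysics Co., Ltd., Beijing 100029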
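1 Introduction
Among the techniques for seismic reservoir characterization and detection, seismic facies analysis (Nivlet, 2007) is indispensable. Manual seismic facies analysis (Saggaf et al., 2003) is time-consuming and depends on an experienced interpreter working with suitable analysis methods. As exploration accuracy improves and high-density, wide-azimuth acquisition is adopted, seismic data have entered the big-data era, and extracting geological features from massive seismic volumes has become an active research topic. In recent years, unsupervised seismic facies analysis (de Matos et al., 2007) has attracted growing attention. It borrows the principles of pattern recognition and characterizes geological bodies using seismic attributes derived from the seismic data together with other auxiliary information. Unsupervised seismic facies analysis methods (e.g., Marroquín et al., 2008; Roy et al., 2010; Li Fang et al., 2014) are entirely data-driven, which greatly reduces human bias; even an interpreter unfamiliar with the local geology can obtain a reasonably objective and accurate result.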
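From the viewpoint of pattern recognition, seismic data are continuous, redundant, and consistent. Seismic facies analysis is essentially a classification of seismic data, which may be supervised or unsupervised. Unsupervised classification is also called clustering. A common example is K-means clustering, which takes k as a parameter and partitions n objects into k clusters so that objects within a cluster are highly similar. Similarity is measured mainly by the Euclidean distance from each object to the cluster center. Coléou (2003) applied K-means to cluster seismic facies, but the method (Han Jiawei and Kamber, 2001; Zhang Xuegong, 2010) requires the number of clusters to be defined in advance, is sensitive to "noise" and isolated points, and does not preserve the topological structure of the data. For continuous, low-dimensional, noisy seismic data, K-means clustering does not perform well.
The self-organizing map (SOM) is an unsupervised pattern recognition method proposed by Kohonen (1982, 1990). Its main idea is to project the data onto a low-dimensional space to obtain a more intuitive understanding, but the number of classes can only be judged from experience when selecting the optimal seismic attributes to characterize the geological features in the seismic data. Liu Lihui et al. (1996) used a self-organizing neural network to delineate seismic microfacies. Lu Wenkai and Mou Yongguang (1998) used a self-organizing neural network to track seismic events. Steeghs and Drijkoningen (2001) used joint time-frequency analysis to describe how subtle changes in subsurface reflections cause fluctuations in frequency content. Mu Xing (2005) used a self-organizing neural network with optimally selected geometric attributes to identify seismic facies automatically. Marcílio et al. (2007) introduced the wavelet transform to identify the singularities in each geological segment of the seismic trace, which makes SOM clustering easier to implement.
Particle swarm optimization (PSO) is a swarm-intelligence optimization method proposed by Kennedy and Eberhart (1997) and Eberhart and Shi (2001) from studies of the foraging behavior of birds. Yue Bibo et al. (2009) improved the particle update with three-point filtering so that the particles converge faster. Zhu Tong (2011) improved the particle swarm method through the interaction between preceding and following particles, which also increased the convergence speed. Liu et al. (2011) proposed a PSO-based multi-attribute dynamic clustering method, which uses swarm-intelligence optimization to remove the influence of outliers on the choice of cluster centers in K-means clustering.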
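The K-means procedure summarized above (assign each object to the nearest center by Euclidean distance, then recompute the centers) can be illustrated with a minimal sketch. This is only an illustration of the textbook iteration, not code from this study; the function names `euclidean` and `kmeans`, the fixed iteration count, and the requirement that the caller supply the initial centers are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kmeans(points, centers, n_iter=20):
    """Textbook K-means starting from caller-supplied centers (hypothetical helper)."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each object joins the cluster with the nearest center.
        labels = [min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

labels, centers = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)],
                         centers=[(0, 0), (10, 10)])
```

Because the result depends on the supplied number of clusters and on the starting centers, the sketch also makes the limitations noted above concrete: k must be fixed in advance, and an outlier included in a cluster pulls its mean.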
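2 Self-organizing map
The Finnish scholar Kohonen proposed the self-organizing map while studying associative memory and adaptive learning machines. The learning mechanism of this network closely resembles the self-organization of functional regions in the human cerebral cortex, and it is an unsupervised competitive learning method. As shown in Fig. 1a, the SOM network consists of an input layer and an output layer, with no connections within a layer and full connections between the layers. The output-layer neurons are arranged regularly in a rectangular or hexagonal grid on a single layer. Training samples are fed into the network sequentially. Each dimension of an input sample is connected to the neurons of the network through weights, while the neurons themselves are not connected to one another. The sensitivity of a neuron to an input sample is measured by the Euclidean distance, and the most sensitive neuron becomes the best matching unit. Suppose the training sample x(t) has dimension N; the Euclidean distance between the connection weights mij(t) of the neuron at position i and the training sample xi(t) is then computed (Eq. 1).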
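The best-matching-unit search and the weight adjustment sketched in Fig. 1b can be written compactly as follows. This is a minimal sketch under stated assumptions, not the exact formulas of this paper: it uses the standard Kohonen update m_i(t+1) = m_i(t) + alpha * h * (x(t) - m_i(t)) with a Gaussian neighborhood function, and the names `best_matching_unit`, `som_update`, `alpha`, and `radius` are hypothetical.

```python
import math

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector has the smallest Euclidean distance to x."""
    def dist(m):
        return math.sqrt(sum((xj - mj) ** 2 for xj, mj in zip(x, m)))
    return min(range(len(weights)), key=lambda i: dist(weights[i]))

def som_update(weights, grid, x, alpha, radius):
    """One training step: find the BMU, then pull it and its grid neighbors toward x.

    weights: list of neuron weight vectors (modified in place)
    grid:    list of neuron coordinates on the output layer
    alpha:   learning rate (assumed parameter)
    radius:  neighborhood radius R (Gaussian neighborhood is an assumption)
    """
    c = best_matching_unit(weights, x)
    for i, m in enumerate(weights):
        # Squared distance on the output grid between neuron i and the BMU.
        g2 = sum((a - b) ** 2 for a, b in zip(grid[i], grid[c]))
        h = math.exp(-g2 / (2.0 * radius ** 2))
        weights[i] = [mj + alpha * h * (xj - mj) for xj, mj in zip(x, m)]
    return c

weights = [[0.0, 0.0], [1.0, 1.0]]
grid = [(0, 0), (0, 1)]
bmu = som_update(weights, grid, x=[0.9, 0.9], alpha=0.5, radius=1.0)
```

In practice both `alpha` and `radius` are usually decreased as training proceeds, so that the map first orders itself globally and then fine-tunes locally; the schedule is a design choice left out of this sketch.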
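Fig.1 Structure of the self-organizing map (a) and sketch of neuron weight adjustment (b)
Fig.2 Neighborhood functions (the horizontal axis is the neighborhood radius and the vertical axis is the neuron response function; the left and right panels show two different neuron response functions; R denotes the neighborhood radius)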
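3 K-means clustering based on the self-organizing map and particle swarm optimization
① Select the sensitive attributes that best reflect the geological target, preprocess the seismic attributes, and initialize the SOM network;
② Feed the seismic attributes into the network one at a time for training: compute the distance between each neuron and the sample with Eq. (1), determine the best matching unit, and update the weights with Eq. (2);
③ If the prescribed number of iterations is reached or the weights have stabilized and no longer change, training is complete; otherwise set t=t+1 and return to ②;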
Fig.3 Flow diagram of the improved method
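⑤ Update the velocities and positions of all particles according to Eqs. (4) and (5), cluster again, and record the correspondence between the attribute sample points and the neurons. If the iteration condition is satisfied, the algorithm terminates; otherwise return to step ④ and recompute;
⑥ Map the final clustering result of the neural network back to the original sample space to obtain the classified attribute samples, which completes the seismic facies clustering.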
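The particle update and re-clustering in the steps above can be sketched as follows. This is a minimal illustration, assuming the widely used inertia-weight form of PSO (v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x), then x = x + v) and a fitness equal to the sum of Euclidean distances from each SOM prototype vector to its nearest cluster center. The names `update`, `fitness`, `pso_cluster`, and the parameter values `w`, `c1`, `c2` are hypothetical and are not taken from this paper.

```python
import math
import random

def update(x, v, pbest, gbest, w, c1, c2, r1, r2):
    """Inertia-weight PSO update for one coordinate; returns (new position, new velocity)."""
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v_new, v_new

def fitness(centers, prototypes):
    """Sum of Euclidean distances from each prototype to its nearest center (assumed objective)."""
    return sum(min(math.dist(p, c) for c in centers) for p in prototypes)

def pso_cluster(prototypes, k, n_particles=10, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k cluster centers of the SOM prototype vectors with PSO."""
    rng = random.Random(seed)
    dim = len(prototypes[0])
    # Each particle encodes k candidate centers, started at randomly chosen prototypes.
    pos = [[list(rng.choice(prototypes)) for _ in range(k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [fitness(p, prototypes) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    pos[i][j][d], vel[i][j][d] = update(
                        pos[i][j][d], vel[i][j][d], pbest[i][j][d], gbest[j][d],
                        w, c1, c2, rng.random(), rng.random())
            f = fitness(pos[i], prototypes)
            if f < pbest_fit[i]:  # record the particle's own best position
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], f
                if f < gbest_fit:  # record the swarm's best position
                    gbest, gbest_fit = [c[:] for c in pos[i]], f
    # Label every prototype with its nearest center in the best solution found.
    labels = [min(range(k), key=lambda j: math.dist(p, gbest[j])) for p in prototypes]
    return gbest, gbest_fit, labels
```

Running the search on the SOM prototypes rather than on every attribute sample is what keeps the cost low: the labels returned for the prototypes are then carried back to the original samples through the recorded sample-to-neuron correspondence.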
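4 Numerical simulation
Fig. 4a shows a four-layer medium model. The velocities of the first, third, and fourth layers are 4000 m·s⁻¹, 4000 m·s⁻¹, and 5000 m·s⁻¹, respectively, while the velocity of the second layer varies laterally, taking the values 3000 m·s⁻¹, 3300 m·s⁻¹, and 3700 m·s⁻¹. To demonstrate that the method can analyze lateral changes in formation lithology, we use a Ricker wavelet with a dominant frequency of 30 Hz and a 4 ms sampling interval and obtain by forward modeling the synthetic seismic section shown in Fig. 4b. Each trace is then fed into the algorithm of Fig. 3 for cluster analysis (for this model the time samples are taken as the attributes). Fig. 5a shows the result: the three lateral variations are clearly separated into three classes, with different colors representing different classes. As shown in Fig. 5b, the objective function is already close to zero in the early iterations and remains at zero as the number of iterations increases, which indicates that the method is stable and convergent. In summary, the method identifies the differences in seismic facies caused by lateral changes in the reservoir and separates them clearly, even at the boundaries, while showing good convergence and stability.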
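The forward modeling of one trace can be sketched with a simple convolutional model. The following is a minimal illustration under assumptions that are not stated in the text: constant density (so the reflection coefficient depends on velocity only), normal incidence, and hypothetical interface positions at samples 50, 70, and 90. The function names `ricker`, `reflectivity`, and `convolve_same` are likewise hypothetical; only the 30 Hz dominant frequency, the 4 ms sampling, and the layer velocities come from the model description.

```python
import math

def ricker(f, dt, half_len):
    """Ricker wavelet w(t) = (1 - 2(pi f t)^2) exp(-(pi f t)^2), sampled at t = k*dt."""
    return [(1.0 - 2.0 * (math.pi * f * k * dt) ** 2) * math.exp(-(math.pi * f * k * dt) ** 2)
            for k in range(-half_len, half_len + 1)]

def reflectivity(velocities, interface_samples, n):
    """Reflection coefficients (v2 - v1) / (v2 + v1), assuming constant density."""
    r = [0.0] * n
    for (v1, v2), idx in zip(zip(velocities, velocities[1:]), interface_samples):
        r[idx] = (v2 - v1) / (v2 + v1)
    return r

def convolve_same(r, w):
    """Convolve r with a zero-phase wavelet w and keep the same length as r."""
    half = len(w) // 2
    out = []
    for i in range(len(r)):
        s = 0.0
        for k, wk in enumerate(w):
            j = i - (k - half)
            if 0 <= j < len(r):
                s += r[j] * wk
        out.append(s)
    return out

wavelet = ricker(f=30.0, dt=0.004, half_len=12)
# One trace per lateral zone of the second layer (3000, 3300, 3700 m/s).
traces = [convolve_same(reflectivity([4000.0, v2, 4000.0, 5000.0], [50, 70, 90], 128), wavelet)
          for v2 in (3000.0, 3300.0, 3700.0)]
```

In this sketch the three zones differ only in the amplitudes at the top and base of the second layer, which is the kind of lateral change the clustering is asked to separate when the time samples of each trace are used as its attributes.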
Fig.4 Geological model (a) and its synthetic seismogram (b)
Fig.5 Clustering results (a) and convergence function (b)
Fig.6 Dynamic attribute clustering results based on SOM and PSO: (a) SNR=25 dB; (b) SNR=10 dB; (c) SNR=2 dB; (d) SNR<1 dB
Fig.7 Clustering results from a commercial software package (SNR<1)
5 Application to field data
Fig.8 Seismic section
Fig.9 Time thickness map
Fig.10 Seismic attribute maps: (a) Instantaneous amplitude; (b) Instantaneous frequency; (c) Instantaneous phase; (d) Coherent curvature.
Fig.11 SOM classification results and seismic attribute clustering map using SOM
Fig.12 Clustering results from a commercial software package
Fig.13 Seismic attribute clustering results using PSO
Fig.14 Dynamic seismic attribute clustering results based on SOM and PSO
Fig.15 Comparison of computing time
6 Conclusions
Unsupervised seismic facies analysis technology based on SOM and PSO
ZHANG Yan1, ZHENG Xiao-Dong1, LI Jin-Song1, LU Jiao-Tong2, CAO Cheng-Yin1, SUI Jing-Kun1
Seismic facies are mappable 3D seismic units composed of groups of reflections whose parameters differ from those of adjacent facies units; they are the seismic expression of the macroscopic characteristics of sedimentary facies. Seismic facies analysis describes and interprets seismic reflection parameters, such as configuration, continuity, amplitude, and frequency, within the stratigraphic framework of a depositional sequence. As a key step in the seismic interpretation workflow, it provides information on depositional processes and environments and can ultimately predict potential reservoirs from seismic data alone, in the absence of well data. When geological information is incomplete or nonexistent, seismic facies analysis is called unsupervised and is performed with unsupervised learning or clustering algorithms. Although unsupervised seismic facies analysis is an effective technique for reservoir prediction, traditional methods process large seismic data sets slowly.
Traditional methods also tend to become trapped in local minima, which leads to inaccurate clustering of seismic data. To overcome this defect, this paper proposes a seismic facies analysis method that combines the Self-Organizing Map (SOM) and Particle Swarm Optimization (PSO). First, we select the sensitive attributes that reflect the geological target, normalize the seismic attributes, and initialize the SOM network. We choose SOM because it compresses a large volume of redundant seismic data into a much smaller set. As one of the most promising mathematical techniques for unsupervised pattern classification, SOM also preserves the topological structure of the original samples. Second, we feed the seismic attributes into the network one by one, compute the Euclidean distance between each neuron and the sample, identify the best matching unit, and update the weights according to the update rule. Training ends when a prescribed number of iterations is reached or the weights stabilize; otherwise the procedure returns to the previous step. After this data compression, we improve K-means clustering using the global optimization capability of PSO, which is initialized with a group of random particles (solutions) and then searches for optima by updating generations. In every iteration, each particle is updated by following two “best” values. The first is the best solution (fitness) that the particle itself has achieved so far, called pbest. The second, tracked by the particle swarm optimizer, is the best value obtained so far by any particle in the population; this global best is called gbest. Based on the trained SOM network, PSO searches directly for a clustering partition that minimizes the fitness, which is measured by the Euclidean distance, and pbest and gbest are recorded. Using these results, we update the velocities and positions of all particles, cluster again, and record the correspondence between attribute samples and neurons. If the iteration condition is met, the algorithm terminates; otherwise it returns to the previous step and recomputes. Finally, we map the clustering results back to the original sample space to obtain the classified attribute samples.
For the theoretical model, we design a four-layer medium with a lateral velocity change in the second layer, which represents lateral variation in formation lithology. We generate a synthetic seismogram by forward modeling with a Ricker wavelet and cluster it with the algorithm described above, treating the time samples as the attributes. The method clearly separates the three lateral variations, with different colors representing different classes. It is also stable and convergent: as the number of iterations increases, the objective function value remains near zero. To test robustness to noise, we add noise at different signal-to-noise ratios (SNR) to the model: SNR=25 dB, SNR=10 dB, SNR=2 dB and SNR<1 dB. When SNR>1 the clustering performs very well and the lateral variation is delineated by distinct boundaries. Even when SNR<1 the changes can still be broadly detected, and the results remain usable as a reference despite some clustering errors. For comparison, we process the SNR<1 model with a commercial software package, and its clustering is completely disordered. This indicates that our method yields stable results even when the seismic data quality is poor. In an application to real data from the Tarim Basin, the seismic facies maps obtained with our method and with SOM are better than that of the commercial software: the boundary and fault zone of the reef facies are depicted more clearly. Comparing the seismic facies obtained with our method against the wells in the area, oil wells W2, W21, W22 and W23 fall in the red area, which implies a potential oil reservoir, whereas dry wells W25 and W28, which have no oil production, lie in the brown belt. A comparison of run times also shows that our improved algorithm is considerably faster than the commercial software.
Traditional seismic facies analysis methods are usually limited by massive seismic data because of their low computational efficiency. To address this problem, we propose a multi-attribute clustering method that combines SOM and PSO. It exploits the ability of SOM to compress redundant seismic data into a smaller set while preserving the original topological structure, and then improves K-means clustering through the global optimization of PSO. The theoretical model and the real data show that the algorithm compresses seismic data effectively and provides a more accurate global solution. For seismic facies prediction, it performs well in both computational efficiency and accuracy.
Self-organizing feature map (SOM); Particle Swarm Optimization (PSO); Unsupervised seismic facies analysis; Clustering
