
SegNet-based first-break picking via seismic waveform classification directly from shot gathers with sparsely distributed traces

2022-03-30

Petroleum Science, 2022, Issue 1

San-Yi Yuan, Yue Zhao, Tao Xie, Jie Qi, Shang-Xu Wang *

a State Key Laboratory of Petroleum Resources and Prospecting,CNPC Key Laboratory of Geophysical Exploration,China University of Petroleum,Beijing 102249,China

b China Oilfield Services Ltd.,Tanggu,Tianjin 300452,China

c University of Oklahoma,ConocoPhillips School of Geology and Geophysics,Norman,OK,USA

Keywords: First-break picking; Deep learning; Irregular seismic data; Waveform classification

ABSTRACT Manually picking regularly and densely distributed first breaks (FBs) is critical for shallow velocity-model building in seismic data processing. However, it is time consuming. We employ the fully-convolutional SegNet to address this issue and present a fast automatic seismic waveform classification method to pick densely-sampled FBs directly from common-shot gathers with sparsely distributed traces. By feeding a large number of representative shot gathers with missing traces and the corresponding binary labels segmented by manually interpreted fully-sampled FBs, we can obtain a well-trained SegNet model. When any unseen gather, including one with irregular trace spacing, is input, the SegNet outputs the probability distribution of the different categories for waveform classification. FBs can then be picked by locating the boundaries between one class on post-FBs data and the other on pre-FBs background. Two land datasets, each with over 2000 shots, are adopted to illustrate that one well-trained 25-layer SegNet can favorably classify waveforms and further pick fully-sampled FBs, verified against the manually-derived ones, even when the proportion of randomly missing traces reaches 50%, 21 traces are missing consecutively, or traces are missing regularly.

1.Introduction

Seismic data have been commonly used to understand stratigraphic structures and to predict the distribution of oil and gas at depths of thousands of meters. The first arrivals, whose onset times are the first breaks (FBs), contain important feature information. For instance, they can be adopted to remove the effect of the shallow fluctuating weathering layer, which is the cornerstone of subsequent seismic data processing and interpretation (Yilmaz, 2001). Moreover, they can be utilized to estimate the source wavelet, which is the basis of forward modelling and the seismic inverse problem, and to build the shallow velocity model, which is useful for seismic migration and full waveform inversion (Yilmaz, 2001; Li et al., 2019; Liu et al., 2020; Yao et al., 2020). Consequently, detecting the first arriving waves, or picking FBs, is helpful for geophysicists.

Manual picking is the most straightforward method. In addition, it can incorporate any prior knowledge, such as the spatial continuity of the first arriving waves. Even for low-quality traces or missing traces, human interpreters can still infer FBs by making use of the information in neighboring traces. However, owing to the increasing number of seismic traces, especially in wide-azimuth and high-density acquisition, manual picking is time-consuming and expensive. Various (semi-)automated first-break picking methods have been proposed and developed to improve picking efficiency and to alleviate the pressure on interpreters. Each method has its own advantages and disadvantages. One traditional class of automatic first-break picking techniques is based on single-trace time- or frequency-domain methods operating on single- or multicomponent recordings from an individual receiver level, such as energy-based methods (Coppens, 1985; Sabbione and Velis, 2010), entropy-based methods (Sabbione and Velis, 2010), fractal-dimension-based methods (Boschetti et al., 1996; Jiao and Moon, 2000), and higher-order-statistics-based methods (Yung and Ikelle, 1997; Saragiotis et al., 2004; Tselentis et al., 2012). In general, such methods can quickly and stably pick FBs on data with high signal-to-noise ratio (SNR). Compared with time- or frequency-domain methods, time-frequency domain methods add one time or frequency dimension and thus have more potential to stably pick FBs at relatively low SNR and even to detect weak signals or weak first-break waves (Galiana-Merino et al., 2008; Mousavi et al., 2016).

In contrast to single-trace (semi-)automated methods, multi-trace ones, such as cross-correlation-based methods (Gelchinsky and Shtivelman, 1983; Molyneux and Schmitt, 1999) or template-matching methods (Plenkers et al., 2013; Caffagni et al., 2016), make use of information from multiple receivers within the array simultaneously. In essence, multi-trace methods pick FBs by taking the maximum values of the cross-correlation or convolution results from trace(s) to trace(s). Therefore, the original multi-trace time-space amplitude information is generally utilized to pick FBs. Owing to the simultaneous usage of multiple traces or of high-dimensional (2D or 3D) information, the multi-trace cross-correlation or template-matching based methods can, to some extent, identify weak signals or pick FBs at low SNR (Gibbons and Ringdal, 2006; Caffagni et al., 2016). However, they often do not adapt well to situations where the waveforms change from trace to trace or where missing or bad traces exist.

Most single- or multi-trace methods compute only one attribute for each time sampling point and subsequently select the locations with maximum- or minimum-value attributes as FBs. To improve picking stability, attributes from the time domain, frequency domain, time-frequency domain and/or time-space domain can be combined to classify waveforms and further pick FBs by artificial rules (Gelchinsky and Shtivelman, 1983; Akram and Eaton, 2016; Khalaf et al., 2018) or by traditional fully-connected artificial neural networks (ANNs) (McCormack et al., 1993; Gentili and Michelini, 2006; Maity et al., 2014). Multi-attribute first-break picking approaches based on artificial rules are straightforward and explicable. However, they increase the number of parameters that need to be tuned once the multi-attribute generators are chosen. Multi-attribute approaches based on fully-connected ANNs can automatically analyze and combine multiple attributes extracted from the data at the cost of learning a large number of network parameters. Nevertheless, they usually require extra attribute extraction and attribute optimization.

Convolutional neural networks (CNNs), a further developed type of ANN, have become popular in recent years by breaking previous records for various classification tasks, especially in image and speech processing (LeCun et al., 2015; Russakovsky et al., 2015; Goodfellow et al., 2016; Silver et al., 2016; Esteva et al., 2017; Perol et al., 2018). At the cost of training on big data, CNNs have the advantage of automatically extracting features or attributes while classifying data. Moreover, they can be designed deep, owing to the properties of local perception and weight sharing. In recent years, CNNs have been successfully introduced into the field of solid Earth geoscience, including waveform classification (Bergen et al., 2019; Dokht et al., 2019; Pham et al., 2019; Wu et al., 2019). For instance, several 1D-CNNs have been developed to effectively detect or classify earthquake phases or micro-seismic events, and to further pick FBs trace by trace or receiver by receiver (Chen et al., 2019; Wang et al., 2019; Wu et al., 2019; Zhu and Beroza, 2019). Several 2D-CNNs have been presented to effectively detect or classify seismic waveforms and to further pick FBs, without involving the cases of missing traces or poor-quality traces (Yuan et al., 2018; Hu et al., 2019; Xie et al., 2019; Zhao et al., 2019; Liao et al., 2021).

In this paper, we apply the deep fully-convolutional SegNet, consisting of an end-to-end encoder-decoder automatic spatio-temporal feature extractor and a following pixel-wise classifier (Badrinarayanan et al., 2017), to automatically segment post-FBs data and pre-FBs background in common-shot gathers. At the cost of learning various representative shot gathers with sparsely distributed traces and the manually interpreted dense FBs, one well-trained deep SegNet can be yielded to pick FBs of any unseen gather having differently distributed missing traces or regularly distributed traces with a coarse trace interval. The network architecture is designed to be deep, containing a set of convolutional layers and pooling layers, in order to see more information from adjacent seismic traces and to pick FBs especially at the positions of missing traces. The presented approach needs neither attribute pre-extraction and pre-selection steps nor an additional interpolation step. Interpolation is often required to reconstruct sparsely distributed traces and is useful for subsequent seismic data processing and inversion (Jia and Ma, 2017; Sun et al., 2019; Wang et al., 2019).

We begin this paper with the introduction of the motivation and general framework, as well as the description of data preparation, network architecture design, network model evaluation, model update and the optimum model application. One seismic dataset including 2251 shot gathers and another dataset including 2273 gathers are then adopted to illustrate the effectiveness of the proposed 25-layer SegNet for classifying waveforms and further picking FBs directly from shot gathers with different distributions of missing traces. Finally, a discussion and some conclusions are given.

2.Methodology

2.1.Motivation and general framework

First breaks (FBs) refer to the earliest arrival times of the seismic waves at the receivers in P-wave exploration. Therefore, there is only one FB in each seismic trace. Because some reflected waveforms below FBs, such as some separated waves, are similar to first-break ones, and because of the unbalanced-label challenge, direct classification into FB and non-FB categories may give rise to some false FBs (Yuan et al., 2018). As is commonly observed, the waveform features above and below FBs are typically different in seismic common-shot gathers. There is only background or noise above FBs, which is usually spatially uncorrelated, while there are dominant signals below FBs, which are more spatially correlated. Consequently, we can divide shot gathers into two classes: pre-FBs background and post-FBs data. In this way, the boundaries between the two classes are the FBs.

It is important to obtain regularly and densely distributed FBs from sparsely distributed traces. One challenge of first-break picking is that the trace intervals in each shot gather are often irregular, or the regular trace interval is large. The poor quality of first-break picking for such sparsely distributed traces can degrade the quality of shallow velocity-model building. Herein, we aim to predict the regularly dense waveform classification from shot gathers with randomly or regularly missing traces and to further pick FBs directly. More concretely, we want to find an optimal function that converts shot gathers with missing traces into mask data in which each trace is an ideal step function. Owing to the missing traces, the optimal function should have the ability to capture information from adjacent seismic traces. Convolution-based networks have been demonstrated to be suitable for classifying 2D or 3D images and to easily achieve more translation invariance for robust classification via a series of high-dimensional convolution layers and pooling layers. Recently, the fully convolutional networks (FCNs) without fully-connected layers (FCLs), proposed by Long et al. (2015) in the context of image semantic segmentation, have become a popular architecture, including in the seismic exploration field. The FCNs have the advantages of achieving end-to-end pixel-wise classification and of allowing the network architecture to be designed flexibly, in contrast to the typical sliding-window CNNs involving one or several FCLs before the classification. Several variants of FCNs, including U-net (Ronneberger et al., 2015), DeepLab (Chen et al., 2018), SegNet (Badrinarayanan et al., 2017) and DeconvNet (Noh et al., 2015), have recently been widely developed. Taking missing traces into account, the network should be designed relatively deep to see a larger context and to utilize more information from adjacent seismic traces. From the viewpoint of memory, we choose the SegNet architecture, which, compared with U-net, does not transfer the entire feature maps extracted in the encoder to the corresponding decoder but instead reuses the pooling indices in the corresponding decoder.

Taking SegNet as the function set, the regularly dense pixel-wise classification directly from a certain shot gather X with missing traces can be expressed as

P = SegNet(X; m),    (1)

where P represents the predicted probability with two channels indicating the background class and the signal-dominant class, and m represents the network parameters or variables in the function. SegNet represents a mapping function from X to P, involving some hyper-parameters such as the number of convolution layers, the number of (un)pooling layers, and the size of the filters in the convolution layers. As denoted in the SegNet architecture for waveform classification and first-break picking (Fig. 1), SegNet consists of an end-to-end encoder-decoder feature extractor with trainable parameters and a following classifier without any learned parameters. The SegNet input is the preprocessed one-channel shot gather with missing traces, while the output is the two-channel map that is supervised by the given labels with one pre-FBs background class and the other post-FBs data class.

At the cost of training on a large amount of labeled data, which can be obtained by manually interpreting dense FBs, SegNet can automatically extract statistically spatio-temporal features or attributes rather than pre-computing several attributes sensitive to FBs. The optimal extractor learned from various labeled data can yield high-level features, condensed by the previous series of layers, with the property that there is a large difference between the high-level feature values above FBs and those below FBs, but a small difference among feature values within the same category, including those at missing traces. As a result, these high-level features can be fed into a classifier and converted into a pixel-wise probability, which can be used to interpret each pixel as either the pre-FBs background class or the post-FBs signal-dominant class.

Our general framework for waveform classification and first-break picking from shot gathers with sparsely distributed traces is summarized as follows:

1) Data preparation. Divide the data into three subsets, including the training set, validation set and test set, and preprocess them with the same processing flow.

2) Network architecture design. Design the network architecture, including the number of (de)convolution layers, the number of (un)pooling layers, the size of the filters, and the combination mode of the different layers.

3) Network model evaluation. Define a loss function and several performance metrics to quantitatively evaluate the quality of a large number of models.

4) The optimum model choice. Find an optimum model by using gradient descent and an early stopping strategy according to both the training set and the validation set.

5) The optimum model application. Apply the optimum model to unseen gathers in the test set to automatically extract the features, classify waveforms and further pick FBs.

2.2.Data preparation

To obtain an effective, well-trained and generalizable model, i.e., the SegNet classifier and picker, three subsets including the training, validation and test sets are prepared. The training set is used for training the model or optimizing the network parameters, as well as for designing the learning rate, which is a hyper-parameter in the model update. The validation set is adopted to evaluate whether the learnt network is overfitting and to determine the network hyper-parameters. The test set is used to investigate the generalization ability of the well-trained model to unseen shot gathers. In both the training and validation sets, various typical and diverse samples, including shot gathers with different kinds of noise, different geometries of FBs, as well as different distributions of missing traces, should be prepared as far as possible, and all are cropped to the same size in both the time dimension and the space dimension. Accordingly, the labels used for supervising the network model are acquired by manually picking dense FBs and then segmenting gathers into one pre-FBs background class and the other post-FBs data class. Because there are only two classes, the labels are quantified by binary maps. The ideal binary vectors (1 0) and (0 1) are designed to denote the post-FBs data class and the pre-FBs background class, respectively. An example of the input shot gather and the corresponding output label can be seen in Fig. 1. To learn the model stably, fast and effectively, all samples in the different sets should be further preprocessed, mainly including outlier or heavy-noise elimination, surface wave attenuation, filtering, cropping and normalization.
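As a concrete illustration, the following minimal Python/NumPy sketch shows one way such a binary label map could be built from manually picked FB times; the function name build_label_mask and the toy fb_samples values are ours, not from the original work:

    import numpy as np

    def build_label_mask(fb_samples, n_time):
        """Binary segmentation label from manually picked first breaks.

        fb_samples: 1D array with one picked FB time index per trace.
        Returns an integer mask of shape (n_time, n_traces):
        1 = post-FBs data class, 0 = pre-FBs background class.
        """
        fb_samples = np.asarray(fb_samples)
        t = np.arange(n_time)[:, None]                  # column of time-sample indices
        return (t >= fb_samples[None, :]).astype(np.int64)

    # the two-channel one-hot form, i.e., (1 0) for post-FBs data and (0 1) for background,
    # can then be obtained with np.stack([mask, 1 - mask], axis=0)
    mask = build_label_mask(fb_samples=[120, 118, 121], n_time=550)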

2.3.Network architecture design

There exists only one boundary line in each shot gather, but real missing traces can complicate seismic waveform classification and first-break picking. A shallow-layer model can mainly learn the common features of waveforms, but it cannot easily and accurately classify some complex waveforms with strong lateral variation or those at the locations of relatively dense missing traces. Therefore, the SegNet model is still required to be relatively deep. On the whole, several down-sampling stages in the encoder, the same number of up-sampling stages in the decoder, and a final pixel-wise classification layer are designed, as shown in Fig. 1. Inside each stage, the (de)convolution layer consists of convolution, batch normalization (BN) and rectified linear unit (ReLU) activation, and (un)pooling is placed between several (de)convolution layers.

For each convolution layer, as denoted in the blue boxes of Fig. 1, its output can be expressed as

C_l = max(0, BN(W_l * X_l + B_l)),    (2)

where the symbol * is a 2D convolution or deconvolution operator, W_l is the 2D convolutional filter or the weights at the l-th layer, B_l is an aggregation of the biases that are broadcast to each neuron in the feature or attribute map at the l-th layer, X_l is the input of the convolution layer, and C_l is the output of the convolution layer. When l is 1, X_l is the input shot gather, or X in Eq. (1). BN represents a differentiable batch-normalization transformation that normalizes the convolution output across a mini-batch (Ioffe and Szegedy, 2015), and thus has the advantage of less internal covariate shift from shallow layers to deep layers. The activation function max, i.e., the state-of-the-art ReLU, represents an element-wise maximization operation, which enables the SegNet model to be a universal nonlinear function approximator. It has been demonstrated that ReLU has the advantages of avoiding the notorious vanishing-gradient problem and promoting model sparsity (Glorot et al., 2011).
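A minimal PyTorch sketch of one such convolution layer (convolution, batch normalization, ReLU), assuming the 3 x 3 kernels and stride of 1 used later in the examples; the helper name conv_bn_relu is ours:

    import torch.nn as nn

    def conv_bn_relu(in_channels, out_channels, kernel_size=3):
        """One SegNet-style convolution layer: convolution + BN + ReLU, as in Eq. (2)."""
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=kernel_size // 2),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )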

Fig. 1. The SegNet architecture for seismic waveform classification and first-break picking directly from shot gathers with sparsely distributed traces. Blue boxes represent the convolution layers containing a convolution operation, a batch normalization operation and an activation operation; green boxes represent the pooling operators; red rectangles represent the unpooling operators implemented by pooling indices computed in the pooling step; and the yellow box represents the Softmax classifier.

For the pooling layer, as denoted in the green boxes of Fig. 1, its output can be expressed as

D_l = POOL(C_l),    (3)

where the symbol POOL represents a downsampling or pooling operator, and D_l represents the pooling result at the l-th layer, whose size is smaller than that of the feature map C_l. We use max-pooling as the POOL operator to yield the maximum values by comparing the neighborhood of the feature map in a sliding-window way, and we record the locations of the maximum values with indices at the same time. The max-pooling layer typically compresses the feature maps along all dimensions while retaining the most important information. Several stacked max-pooling layers can gradually reduce the data dimension and thus act as a strong regularization for the network (Goodfellow et al., 2016) to control overfitting. Furthermore, more pooling layers can achieve more translation invariance, which is useful for effectively classifying translated, rotated and deformed spatio-temporal waveforms. Adopting more pooling layers is also significant for seeing a larger input image context (spatial window), which is useful for the classification of shot gathers with sparsely distributed traces owing to the usage of more spatially neighboring statistical information. However, the pooling operator correspondingly leads to a loss of spatial resolution of the feature maps. The increasing loss of boundary details is not beneficial for spatio-temporal waveform segmentation, where boundary delineation corresponding to FBs is vital. Here, the decoder uses the pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear unpooling. This eliminates the need for learning to upsample.
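The index-preserving max-pooling and the corresponding unpooling can be sketched in PyTorch as follows; the toy feature map below is for illustration only:

    import torch
    import torch.nn as nn

    pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)    # Eq. (3) with recorded indices
    unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

    x = torch.randn(1, 64, 128, 128)              # a 64-channel feature map C_l
    d, indices = pool(x)                          # downsampled map D_l and the argmax locations
    y = unpool(d, indices, output_size=x.size())  # sparse upsampling that reuses the pooling indices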

At last, a pixel-wise Softmax classifier is adopted to convert the output of the attribute extractor into a probability distribution over K different possible classes:

P_{u,v,n} = exp(Y_{u,v,n}) / Σ_{k=1}^{K} exp(Y_{k,v,n}),    (4)

where P_{u,v,n} represents the normalized probability that the v-th data point in the n-th shot is classified into the u-th category, Y_{u,v,n} represents the predicted feature representation produced by an encoder-decoder extractor composed of convolution layers and pooling layers, and K represents the number of classes. Because the Softmax classifies each data point independently, the predicted P also has K channels. In our case, there are only two classes, the noise-dominant background and the signal-dominant waveforms, so K equals 2. The predicted segmentation corresponds to the class with the maximum probability at each pixel.
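In PyTorch, for example, the pixel-wise Softmax of Eq. (4) and the resulting segmentation can be written as follows; the tensor shapes follow our data size of 550 time samples by 168 traces:

    import torch
    import torch.nn.functional as F

    Y = torch.randn(1, 2, 550, 168)    # encoder-decoder output: (batch, K = 2, time, traces)
    P = F.softmax(Y, dim=1)            # Eq. (4): normalized probability at every pixel
    segmentation = P.argmax(dim=1)     # class with the maximum probability at each pixel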

2.4.Network model evaluation

Because the quantity of the pre-FBs background class is comparable to that of the post-FBs data class, the binary cross-entropy criterion in its original form is chosen as the model evaluator or loss function and is given as follows:

O(m) = -(1/(NV)) Σ_{n=1}^{N} Σ_{v=1}^{V} Σ_{k=1}^{2} Q_{k,v,n} log P_{k,v,n},    (5)

where N represents the total shot number, V represents the total number of sampling points or pixels in a shot gather, m represents the model or network parameters, Q_{k,v,n} (k = 1, 2) represents the desired or true binary probability at the v-th data point of the n-th shot, and P_{k,v,n} represents the predicted binary probability via Eq. (1). The closer the predicted probability distribution is to the ideal one, the smaller the cross-entropy value O(m), and therefore the better the built model mapping X to P, as denoted in Eq. (1).
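A sketch of the loss evaluation with PyTorch built-ins; here nn.CrossEntropyLoss is applied to raw network outputs, i.e., it internally combines the log-Softmax of Eq. (4) with the negative log-likelihood of Eq. (5), whereas nn.NLLLoss on log P would be the equivalent choice if the Softmax output is used directly:

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()            # cross entropy averaged over all pixels
    logits = torch.randn(5, 2, 550, 168)         # raw outputs Y for a mini-batch of 5 gathers
    labels = torch.randint(0, 2, (5, 550, 168))  # pixel-wise class indices from the binary labels
    loss = criterion(logits, labels)             # scalar O(m)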

To quantitatively evaluate the well-trained model obtained by minimizing the cross entropy of Eq. (5), as well as its generalization performance, the accuracy between the predicted classification results and the manually interpreted ones, or the consistency between these two results, is evaluated by using the following expressions:

A_n = (TP_n + TN_n) / V,   a = Σ_{n=1}^{N} (TP_n + TN_n) / (NV),

R_n = TP_n / (TP_n + FN_n),   r = Σ_{n=1}^{N} TP_n / Σ_{n=1}^{N} (TP_n + FN_n),

where the true positive TP_n (n = 1, 2, ..., N) represents the number of sampling points in the n-th shot that correctly predict the signal-dominant class below FBs, the true negative TN_n represents the number of sampling points in the n-th shot that successfully predict the background above FBs, and the false negative FN_n represents the number of sampling points in the n-th shot that fail to predict the signal-dominant class. The accuracy A_n and a give the percentage of correctly classified sampling points in the n-th shot gather and in all shot gathers of the training, validation or test sets, respectively. The recall R_n and r give the percentage of correctly classified signal-dominant sampling points among all signal-dominant sampling points of the n-th shot gather and of all shot gathers in the training, validation or test sets, respectively (Powers, 2011).
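The per-shot accuracy and recall can be computed directly from the predicted and labeled masks; a small NumPy sketch, with the function name shot_metrics being ours:

    import numpy as np

    def shot_metrics(pred, true):
        """Pixel-wise accuracy A_n and recall R_n for one shot gather.

        pred, true: integer arrays of shape (n_time, n_traces),
        with 1 = post-FBs signal-dominant class and 0 = pre-FBs background.
        """
        tp = np.sum((pred == 1) & (true == 1))   # correctly predicted signal-dominant points
        tn = np.sum((pred == 0) & (true == 0))   # correctly predicted background points
        fn = np.sum((pred == 0) & (true == 1))   # missed signal-dominant points
        accuracy = (tp + tn) / true.size
        recall = tp / (tp + fn)
        return accuracy, recall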

2.5.Model update

A linear combination of the model update Δm_{t-1} from the previous iteration t-1 and the negative gradient of the loss function is used to update the model at the current iteration t as follows:

Δm_t = μ Δm_{t-1} - η_t ∂O(m_t)/∂m_t,   m_{t+1} = m_t + Δm_t,

where μ represents the momentum coefficient, or the weight of the last update Δm_{t-1}; η_t represents the learning rate, or the weight of the negative gradient at iteration t; ∂O(m_t)/∂m_t = (∂O(m_t)/∂Y_t)(∂Y_t/∂m_t) represents the gradient of the loss function with respect to the model or network parameters; ∂O(m_t)/∂Y_t represents the gradient of the loss function with respect to the input of the Softmax classifier, or the output of the encoder-decoder feature extractor, and is equal to the difference between the predicted probability P and the true probability Q; and ∂Y_t/∂m_t represents the gradient of the output of the encoder-decoder feature extractor with respect to the model parameters, which can be explicitly calculated by using the back-propagation algorithm (Rumelhart et al., 1986). Herein, the gradient term ∂O(m_t)/∂m_t is approximately calculated by randomly selecting only a small subset (mini-batch) of the training shot gathers, which improves efficiency and convergence and is the idea of the mini-batch stochastic gradient descent (SGD) algorithm (LeCun et al., 1998).
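In practice this update rule corresponds to the standard momentum SGD optimizer; a hedged PyTorch sketch, in which model, criterion, gather and label are assumed to be defined elsewhere:

    import torch

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # eta_t and mu

    optimizer.zero_grad()
    loss = criterion(model(gather), label)   # forward pass on one mini-batch
    loss.backward()                          # back-propagation computes dO/dm
    optimizer.step()                         # momentum update of the network parameters m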

When the loss for the training set decreases, whereas the validation loss increases (known as overfitting) or remains stable for a certain period of iterations, the training or the update is stopped. The finally well-trained model SegNet with the optimal network parameters m_opt is saved as the automatic classifier.

2.6.The optimum model application

The saved optimum model can be used to convert any unseen shot gather X_test in the test set into a two-channel normalized probability map with the following formula:

P = SegNet(X_test; m_opt).

When the predicted binary probability in a certain pixel of P is closer to (1 0), we classify the corresponding input sampling point in X_test into the post-FBs signal-dominant class. On the contrary, when the predicted probability is closer to (0 1), we classify the input sampling point into the pre-FBs background class. In an ideal case, the sampling points above FBs should form one class, while those below FBs form the other. Consequently, the boundary between the two classes, or the segmentation line in the classification map, can be located, i.e., the FBs are picked. Furthermore, we take the maximum of each position in the two-channel probability map to obtain a single-channel probability map. It can be utilized to statistically and directly evaluate, to some extent, the quality of the waveform classification along with the first-break picking. The smaller the probability in the single-channel probability map, the lower the accuracy of the waveform classification, or the more unreliable the classification. In an ideal case, the single-channel probability curve for each trace, including missing traces, should be an impulse-like function with only one minimum. The position of the minimum value in each trace can be regarded as the FB. Consequently, we can also pick FBs by locating the minimum values of the single-channel probability map trace by trace.
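A minimal NumPy sketch of this picking step; the function name pick_first_breaks is ours, and the channel order follows the (1 0)/(0 1) label convention above, i.e., channel 0 is the post-FBs class:

    import numpy as np

    def pick_first_breaks(prob):
        """Pick one FB per trace from the predicted two-channel probability map.

        prob: array of shape (2, n_time, n_traces); channel 0 = post-FBs data, channel 1 = background.
        Returns the FB sample index per trace and a per-trace confidence value.
        """
        post = prob[0] >= prob[1]                 # boolean segmentation: True below the first break
        fb = post.argmax(axis=0)                  # first time sample classified as post-FBs data
        single_channel = prob.max(axis=0)         # single-channel probability map
        confidence = single_channel.min(axis=0)   # low values flag uncertain picks near the boundary
        return fb, confidence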

3.Examples

The proposed method is tested on two real land datasets. The two adjacent datasets are from two monitoring receiver lines approximately 500 m apart. The topography of the real working area is complicated, with a surface elevation difference of up to 500 m. One dataset including 2251 common-shot gathers is used for training and validation, where 1751 gathers are randomly chosen for training and the rest for validation. The other dataset, containing 2273 common-shot gathers, is adopted for testing. Each common-shot gather, with a sampling interval of 4 ms, consists of 168 traces and is cropped to 550 time samples. For the two datasets, all sources are the same type of explosives, and the distances between receivers in each monitoring line are approximately the same. All common-shot gathers are preprocessed by implementing outlier or heavy-noise elimination, surface wave attenuation, filtering, and normalization. For each processed gather with 92,400 sampling points, the labels are made by manually picking FBs and then classifying pre-FBs background and post-FBs data.

For our examples, 25 layers, including 18 convolution layers, 3 pooling layers, 3 unpooling layers and 1 Softmax layer, are combined into our network architecture by trial and error, as shown in Fig. 1. Except for the final convolution layer of the decoder, whose number of convolution filters is 2, the number of convolution filters in the other convolution layers is 64. For each convolution kernel, the size is fixed to 3 × 3 with a stride of 1 (i.e., overlapping windows). Following the convolution layers, max-pooling is performed over a 2 × 2 pixel window with a stride of 2 (i.e., non-overlapping windows), and thus the resulting output of the pooling operator indicated by the green box is sub-sampled by a factor of 2 along both the time direction and the space direction.
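The following PyTorch sketch reproduces these layer counts under one additional assumption that is not stated explicitly in the text, namely that the convolution layers are grouped as three per encoder stage and three (two plus the final 2-filter one) per decoder stage; it is meant only to make the 25-layer configuration concrete, not to reproduce the authors' exact implementation:

    import torch.nn as nn

    def conv_stage(in_ch, out_ch, n_conv):
        layers = []
        for i in range(n_conv):
            layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                       nn.BatchNorm2d(out_ch),
                       nn.ReLU(inplace=True)]
        return nn.Sequential(*layers)

    class SegNet25(nn.Module):
        """18 conv layers (64 filters of 3 x 3, the last conv with 2 filters),
        3 max-pooling layers, 3 unpooling layers and 1 Softmax layer."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.ModuleList([conv_stage(1, 64, 3), conv_stage(64, 64, 3), conv_stage(64, 64, 3)])
            self.decoder = nn.ModuleList([conv_stage(64, 64, 3), conv_stage(64, 64, 3), conv_stage(64, 64, 2)])
            self.final = nn.Conv2d(64, 2, 3, padding=1)                 # 18th conv layer, with 2 filters
            self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
            self.unpool = nn.MaxUnpool2d(2, stride=2)
            self.softmax = nn.Softmax(dim=1)

        def forward(self, x):
            sizes, indices = [], []
            for stage in self.encoder:
                x = stage(x)
                sizes.append(x.size())
                x, idx = self.pool(x)                                   # shared pooling applied per stage
                indices.append(idx)
            for stage in self.decoder:
                x = self.unpool(x, indices.pop(), output_size=sizes.pop())
                x = stage(x)
            return self.softmax(self.final(x))                          # two-channel pixel-wise probability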

In all experiments, the 25-layer network is trained by using SGD with a momentum coefficient of 0.9 and a mini-batch size of 5 shot gathers. The learning rate starts from 0.01 and is reduced by a factor of 0.2 every 5 epochs. A full pass of the training algorithm over the entire training set using mini-batches is defined as an epoch. The maximum number of epochs for training is set to 200, and the target cross-entropy loss is set to 0.001. For all experiments, the biases are initialized to zero, and the weights are initialized with zero-mean normally distributed noise. The experimental machine configuration includes an Intel Core i9-9900K CPU (Central Processing Unit) and an Nvidia GeForce RTX 2080Ti GPU (Graphics Processing Unit).
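A hedged sketch of this training configuration; train_set and val_set are assumed PyTorch Datasets yielding (gather, label) pairs, model is assumed to output the two-channel Softmax probabilities as in the architecture sketch above, and evaluate_loss as well as the patience of 10 epochs are our own illustrative choices:

    import torch
    from torch.utils.data import DataLoader

    train_loader = DataLoader(train_set, batch_size=5, shuffle=True)        # mini-batches of 5 gathers
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.2)  # reduce lr every 5 epochs
    criterion = torch.nn.NLLLoss()                                           # cross entropy on log-probabilities

    best_val, patience = float("inf"), 0
    for epoch in range(200):                                                 # at most 200 epochs
        model.train()
        for gather, label in train_loader:
            optimizer.zero_grad()
            prob = model(gather)
            loss = criterion(torch.log(prob.clamp_min(1e-8)), label)
            loss.backward()
            optimizer.step()
        scheduler.step()
        val_loss = evaluate_loss(model, val_set)                             # validation cross entropy
        if val_loss < best_val:
            best_val, patience = val_loss, 0
            torch.save(model.state_dict(), "segnet_best.pt")
        else:
            patience += 1                                                    # early-stopping counter
        if patience >= 10 or best_val < 0.001:                               # stop on overfitting or target loss
            break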

To investigate the generalization performance of the SegNet classifier and picker well-trained on training and validation sets without missing traces (i.e., with regularly and densely distributed traces), we apply it to three kinds of unseen testing data sets, each comprising 2273 common-shot gathers with different ratios of missing traces. For this well-trained network, the total classification accuracies of both the 1751 training gathers and the remaining 500 validation gathers are over 99%. It takes about 170 s to classify the 2273 gathers without parallel computing. Fig. 2a-c shows the accuracy distributions of the 2273 testing common-shot gathers with 0% missing traces (i.e., without missing traces), with 10% randomly missing traces per gather and with 50% randomly missing traces per gather (i.e., 84 of the 168 traces are randomly selected and filled with zero-valued amplitudes) versus the source location. It can be observed from Fig. 2a that the classification accuracy of most of the original 2273 testing gathers without missing traces reaches 98%, while the accuracy of only about 50 gathers is less than 98%, and their corresponding sources are generally close to the black monitoring line. As denoted in Fig. 2b and c, the network trained on the training and validation sets without missing traces performs poorly on the 2273 testing gathers with 10% and 50% randomly missing traces, with accuracies generally below 80%. It is obvious that a network trained on shot gathers without missing traces, or with regularly sampled traces, cannot be generalized well to classify shot gathers with sparsely and randomly distributed traces. It can also be seen that the gathers far from the black monitoring line have lower accuracy than those close to the monitoring line. In addition, the more missing traces, the lower the accuracy.

In order to generalize well to shot gathers with missing traces, we generate an improved SegNet classifier and picker trained on training and validation sets with 50% randomly missing traces, and apply it to various testing common-shot gathers with different distributions of missing traces, including 0% missing traces, to test its performance. For the new well-trained model, the total classification accuracies of the 1751 training gathers and the remaining 500 validation gathers are 99.68% and 99.60%, respectively. Fig. 3a-c shows the accuracy distributions of the 2273 testing gathers without missing traces, with 50% randomly missing traces per gather and with 50% regularly missing traces per gather versus the source location. The testing sets of Fig. 3a and b are the same as those of Fig. 2a and c, respectively. It can be observed from Fig. 3 that the new model trained on shot gathers with missing traces has excellent generalization ability, with the accuracy of almost all gathers over 98%, even in the case of regularly missing traces. The accuracy of only several gathers with sources located near the black monitoring line is relatively low. Fig. 4 displays careful comparisons among the classification accuracies of 500 shot gathers in each testing dataset of Figs. 2a and 3a-c. For all four dataset cases, more than 95% of the gathers in each testing set of 2273 gathers have an accuracy over 98%.
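For completeness, randomly or regularly missing traces like those used in these tests can be simulated by zero-filling columns of a gather; a small NumPy sketch, with the function name drop_traces being ours:

    import numpy as np

    def drop_traces(gather, ratio=0.5, mode="random", seed=None):
        """Zero-fill traces of a (n_time, n_traces) gather to mimic sparse acquisition.

        mode = "random": zero a random subset of ratio * n_traces traces;
        mode = "regular": zero every second trace when ratio = 0.5.
        """
        rng = np.random.default_rng(seed)
        out = gather.copy()
        n_traces = gather.shape[1]
        if mode == "random":
            killed = rng.choice(n_traces, size=int(ratio * n_traces), replace=False)
        else:
            step = max(int(round(1.0 / ratio)), 2)
            killed = np.arange(0, n_traces, step)
        out[:, killed] = 0.0
        return out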

Fig. 2. The accuracy distributions of 2273 testing common-shot gathers with 0% (a), 10% (b) and 50% (c) randomly missing traces per gather, based on the SegNet trained on the training and validation sets without missing traces, versus the explosive source location. Colored dots denote source locations, colored by the classification accuracy of the single gather, and all receivers are located on the black monitoring line. In (a), the hollow pentagram and black hollow circle in the lower left corner, and the blue hollow circle in the upper right corner, denote the locations of the 15th shot, the 243rd shot and the 2213rd shot, respectively. The model trained on shot gathers without missing traces cannot be generalized well to classify shot gathers with missing traces.

Fig. 3. The accuracy distributions of 2273 testing common-shot gathers with 0% missing traces (a), with 50% randomly missing traces per gather (b) and with 50% regularly missing traces (c), based on the new SegNet trained on the training and validation sets with 50% randomly missing traces per gather, versus the explosive source location. The well-trained model has favorable generalization ability, with the accuracy of almost all gathers over 98%.

Fig. 4. Comparisons among the classification accuracies of the first 500 common-shot gathers, arranged from the smallest value to the largest, of the 2273 testing shot gathers in Figs. 2a and 3a-c. The characters 0%MT/0%MT, 50%MT/0%MT, 50%MT/50%MT and 50%MT/50% regular MT in the legend represent the four dataset cases of Figs. 2a and 3a-c, respectively. More than 95% of the gathers in each testing set have an accuracy over 98%.

Fig. 5. The recall distributions of 2273 testing common-shot gathers versus the explosive source location, corresponding to the four dataset cases of Figs. 2a and 3a-c, respectively. The recall of almost all gathers in each testing set reaches 97%.

To focus on investigating the classification ability for post-FBs signal-dominant waveforms, we calculate the recall of the 2273 testing common-shot gathers for the four well-generalized cases of Figs. 2a and 3a-c, as shown in Fig. 5a-d. For these four dataset cases, the recall of almost all gathers reaches 97%, even in the cases of missing traces with different distributions. Compared with the other three cases, the generalization of the new SegNet classifier and picker trained on samples with 50% randomly missing traces to the 2273 testing shot gathers without missing traces is relatively the most dependent on the geometry related to the source locations. The lower the recall, the higher the detection error of the signal-dominant samples. In general, wrong detections happen near sharp tops of the FBs and in weak reflection areas.

To illustrate more clearly and intuitively the effect of seismic waveform classification and first-break picking based on the new SegNet model trained on the training and validation sets with 50% randomly missing traces, we select two representative original common-shot gathers, the 243rd shot gather and the 2213rd one. These two chosen original gathers have different first-break shapes, different types of first-break waves, different distances from the source to the monitoring receiver line, different degrees of weak reflection below FBs, and different kinds of noise. Furthermore, we focus on investigating the generalization ability to the gathers with randomly, regularly and several special distributions of missing traces generated from these two representative original common-shot gathers.

Fig. 6 shows comparisons among the classification results and first-break picking results of the gathers with five distributions of missing traces made from the 243rd original shot gather (Fig. 6f). It can be clearly observed that the classification results of the gathers without missing traces (Fig. 6a), with 50% randomly missing traces (Fig. 6b), with missing traces only on the left (Fig. 6c), with missing traces only on the right (Fig. 6d) and with 50% regularly missing traces (Fig. 6e) are in good accordance with each other, even at the locations of six consecutive missing traces (Fig. 6b and d) between the 135th trace and the 140th trace. In addition, the five segmentation lines (red lateral lines) between the pre-FBs background class (light blue part) and the post-FBs signal-dominant class (dark blue part), or the corresponding picked FBs, are basically consistent and continuous along the space direction. The single-channel probability maps corresponding to Fig. 6a-e are shown in Fig. 7a-e, and local zoomed plots of the low-value probability zones of the middle 84th traces in Fig. 7a-e are shown in Fig. 8. It can be seen that the maximum-probability curve at every trace approximates a pulse-like function with a small width of low-probability values, which are related to the locations of FBs, as well as to the uncertainty of waveform classification and first-break picking. Fig. 9 shows the differences between the automatically picked FBs and the manually picked FBs versus the trace number. The average error between the automatically picked FBs from the 168 traces in each gather case and the manually picked ones is less than two samples, suggesting that the automatically picked FBs are close to the manually picked ones. It can be inferred from Fig. 9 that the differences of FBs do not strongly depend on the distributions and locations of missing traces.

Fig. 6. Comparisons among the classification results and first-break picking results of the gathers with five distributions of missing traces (a-e) generated from the 243rd original common-shot gather (f) based on the new well-trained SegNet. The representatively cropped original gather in (f) contains background, industrial electrical interference (red arrows), one very weak post-FBs reflection area (red rectangle) similar to the pre-FBs background, the surface wave (blue rectangle), near-offset direct waves and far-offset refracted waves. The classification results are overlaid by the picked FBs (red lateral lines) and the seismic amplitude in (a-e), where white vertical zones denote the positions of missing traces. The classification accuracy for the five cases is over 99%. Furthermore, the picked FBs are basically consistent with each other and spatially continuous.

Fig. 7. Comparisons among the single-channel probability maps corresponding to Fig. 6a-e. Black and white represent low- and high-value probability, respectively. Red lateral dashed lines represent the manually picked FBs from the common-shot gather of Fig. 6f, which are regarded as the reference. It can be observed that almost all reference FBs are located in the low-value zones of the five probability maps.

Fig. 10 shows comparisons among the classification results and first-break picking results of the gathers with five distributions of missing traces made from the other, 2213rd, original common-shot gather (Fig. 10f) based on the new SegNet trained on the training and validation sets with 50% randomly missing traces per gather. The corresponding source is far from the monitoring receiver line, as denoted by the position of the blue hollow circle in Fig. 2a. One can clearly see that the classification results of the five gathers with different distributions are good. Moreover, the corresponding picked FBs are visually consistent and spatially continuous, even at the locations of 13 consecutive missing traces (Fig. 10b and d) between the 30th trace and the 44th trace, except for the 35th trace and the 37th trace. Fig. 11a-e displays the single-channel probability maps derived by predicting the original gather and the other four generated gathers, and Fig. 12 displays their corresponding local zoomed plots of the low-value probability zones in the middle traces. It can be observed that the probability curve at each trace looks like a pulse function with several continuous low values. The differences between the automatically picked FBs and the manually picked FBs versus the trace number are shown in Fig. 13. The average error between the automatically picked FBs from the 168 traces in each gather case and the manually picked ones is not more than two samples. In contrast to the five cases derived from the 243rd original common-shot gather, the first-break picking of the five gathers generated from the 2213rd original gather is relatively unstable and more dependent on the distributions of missing traces, as indicated in Figs. 9 and 13.

Fig. 13. The differences between the automatically picked FBs (the red lines of Fig. 10a-e) and the manually picked FBs versus the trace number. The errors of the automatically picked FBs from the 168 traces, including the locations of missing traces, are commonly less than 4 samples, and the average errors corresponding to the five cases are 1.1429, 1.5536, 0.9345, 1.6548 and 1.7083 samples, respectively.

To illustrate why the new well-trained SegNet classifier and picker can directly classify seismic waveforms and subsequently pick FBs at the positions of missing traces, and to further test its generalization ability, we choose another representative common-shot gather in the test set and make a special gather obtained by replacing 21 consecutive traces of the original gather, from the 74th to the 94th trace, with zeros. The chosen representative original gather (Fig. 14c) is the 15th shot, located at the position of the hollow pentagram in Fig. 2a. As the classification results (Fig. 14a and b) of the 15th original shot gather and its corresponding modified gather (Fig. 14d) indicate, the classification result predicted by applying the well-trained SegNet model to a shot gather with 21 consecutive missing traces is similar to that of the original fully-sampled gather, and the classification accuracies for the pair of gathers are 99.78% and 99.52%, respectively. Likewise, the picked FBs are in suitable agreement, even at the locations of missing traces. Furthermore, we extract the 64 features in the 17th convolution layer and the 2 features in the 18th convolution layer obtained by applying the new well-trained SegNet to the two given gathers of Fig. 14c and d, and implement careful comparisons, as shown in Figs. 15 and 16. It can be observed that the 64 deep-layer features automatically extracted from the gather with 21 consecutive missing traces are similar to the corresponding ones extracted from the original gather, even at the positions of the consecutive missing traces. Moreover, as the depth of the convolution layer increases from the 17th convolution layer to the 18th convolution layer, the features automatically extracted from the gather with missing traces become closer to the corresponding ones extracted from the original gather. It is noticeable that the new well-trained SegNet model can extract features with obvious boundaries between the shallow background part and the deep signal-dominant part after both the 17th and 18th convolution layers, even in the presence of 21 consecutive missing traces.

Fig. 8. Local zoomed plots of the low-value probability zones of the middle 84th traces in Fig. 7a-e. The characters 0%MT, 50%MT, Left MT, Right MT and Reg MT in the legend represent the five cases of single-shot gathers with 0%, 50% randomly, only left, only right and regularly missing traces, as indicated in Fig. 6a-e, respectively. There are only about three continuous time samples with a classification probability less than 0.9, regardless of the distributions of missing traces. In addition, the differences among the minimum-probability positions for the five cases are obviously not more than one time sample.

Fig. 9. The differences between the SegNet-based automatically picked FBs (the red lines of Fig. 6a-e) and the manually picked FBs versus the trace number. The errors of the automatically picked FBs from the 168 traces, including the locations of missing traces and even of six consecutive missing traces, are commonly less than four time samples, and the average errors corresponding to the five cases are 0.7411, 1.7589, 1.2054, 1.3185 and 1.6756 time samples, respectively.

4.Discussion

Fig. 10. Comparisons among the classification results and first-break picking results of the gathers with five distributions of missing traces (a-e) generated from the 2213rd original common-shot gather (f) based on the new well-trained SegNet. The representatively cropped original common-shot gather in (f) contains background, industrial electrical interference and asymmetric fluctuating direct waves, which are obviously different from the characteristics of Fig. 6f. The classification accuracy for the five cases reaches 99%. Moreover, the corresponding picked FBs are basically consistent and continuous along the space direction.

Fig. 11. The single-channel probability maps corresponding to Fig. 10a-e. Black represents low-value probability, and red lateral dashed lines represent the manually picked FBs from the common-shot gather of Fig. 10f. It can be seen that the manually picked FBs are located in or near the low-value zones of all probability maps.

Fig. 12. Local zoomed plots of the low-value probability zones of the middle 84th traces in Fig. 11a-e. There are about four continuous time samples with a classification probability less than 0.9. In addition, the differences among the minimum-probability positions for the five cases range from 0 to 5 time samples. In contrast to Fig. 8, there exists relatively large uncertainty in the low-value probability zones.

We adopt a large number of preprocessed seismic common-shot gathers as the network input, and use the corresponding pre-FBs background and post-FBs data labels to supervise the network output to obtain a well-trained end-to-end encoder-decoder extractor. The extractor can automatically extract features instead of relying on pre-computed ones. The automated extractor constructed by stacking a series of artificial neurons enables FBs to be picked in a manner analogous to how a human would pick them. Although the well-trained automated extractor is built at the cost of learning big data, it is easy to prepare abundant seismic shot gathers and the corresponding labels by segmenting at FBs, which can be picked manually. Nevertheless, it is important to learn representative and diverse shot gathers, especially including those with different distributions of missing traces. Under the premise that this basic condition is met, our well-trained fully-convolutional SegNet, consisting of an end-to-end encoder-decoder feature extractor and a following pixel-wise classifier, can quickly and automatically pick FBs of various unseen shot gathers, even those acquired with different geometries, such as different trace intervals, different distributions of receivers, or different numbers of receivers.

We choose data from the same source type of explosives in our examples in order to focus on investigating the effectiveness of SegNet-based automated first-break picking directly from shot gathers with sparsely distributed traces, including randomly missing traces, consecutively missing traces and regularly missing traces. Furthermore, we attempt to account for the reason why the presented SegNet-based model can classify waveforms and pick FBs at the locations of missing traces. It is also worth investigating whether simultaneously learning a large number of seismic data from various sources, such as explosives, vibroseis or air guns, may give the well-trained SegNet a broader generalization ability.

The design of labels is important for learning a data-driven waveform classifier and first-break picker. From the viewpoint of balancing the number of different classes of labels, it is appropriate to label seismic data with one class on post-FBs data and the other on pre-FBs background, which are simply segmented by FBs. The design of the network architecture and the performance of the network learning are dependent on the choice of the input size. We choose various cropped 2D seismic data with relatively large sizes in both the time and space dimensions, which can be roughly determined by balancing post-FBs signal-dominant samples and pre-FBs background samples as well as the computer memory, and we input them into the end-to-end SegNet architecture to implement pixel-wise classification. The SegNet architecture can be designed deeper and more flexibly to see more adjacent information, which helps a deep SegNet to classify waveforms and pick FBs of shot gathers with consecutive missing traces. However, the end-to-end SegNet architecture accompanied by the cropped gather input with a relatively large time-space size requires more preprocessing steps, such as the removal of some local noise (e.g., outliers or 50-Hz industrial electrical interference). The space dimension of the input determines the maximum spatial extent that the network can see. The spatial size of the receptive field, which can be calculated from the number and size of the pooling layers and convolution layers, determines the real spatial range that the network can see; the larger the receptive field, the more spatiotemporal information the network can capture. Besides directly learning spatiotemporal amplitude information, some other attributes extracted from seismic data can be considered as the input, independently or simultaneously, to learn other well-trained models for classifying waveforms and further picking FBs. Furthermore, ensemble learning for first-break picking is also worth investigating.
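The receptive-field size mentioned above can be computed with the standard recursion r_out = r_in + (k - 1) * j_in and j_out = j_in * s over the layer stack; the short Python sketch below applies it to the encoder composition assumed in the architecture sketch given earlier (three 3 x 3 convolutions plus one 2 x 2 pooling per stage), which is our illustrative assumption rather than a value reported here:

    def receptive_field(layers):
        """Receptive field of a stack of layers given as (kernel_size, stride) pairs."""
        r, j = 1, 1                      # receptive field and jump (cumulative stride)
        for k, s in layers:
            r += (k - 1) * j
            j *= s
        return r

    stage = [(3, 1)] * 3 + [(2, 2)]      # three 3x3 convs (stride 1) and one 2x2 max-pooling (stride 2)
    print(receptive_field(stage * 3))    # encoder of three such stages: 50 samples/traces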

We should consider the distribution of missing traces when designing the number of pooling layers, especially the number of consecutive missing traces. In contrast to convolution layers, the use of more pooling layers increases the receptive field more rapidly and helps the network see a larger range of waveforms, so more local statistics can be utilized to perform waveform classification and first-break picking. Moreover, more pooling layers along with convolution layers can achieve more translation invariance, which is useful for effectively classifying translated, rotated and deformed spatiotemporal waveforms and further picking FBs. However, the use of more pooling layers may blur some details related to FBs. This may be one main reason why there is a low-value probability zone of a certain width near the FB in each trace of the single-channel probability map, which is derived from the SegNet-based two-channel probability output map. This paper adopts only 2D pooling and 2D convolution filters to gradually include as much spatiotemporal neighborhood statistics as possible from the input preprocessed common-shot gathers. However, the method can be readily extended to higher dimensions to utilize more information. For instance, 2D pooling and 2D convolutional filters can be directly replaced by 3D or higher-dimensional ones to implement waveform classification and first-break picking in multiple shots and multiple domains, or by adding a frequency or scale dimension.

Fig. 14. Comparisons between the classification results (a-b) of the 15th original common-shot gather (c) and its corresponding modified gather (d) involving 21 consecutive missing traces, based on the new well-trained SegNet. The original cropped gather in (c) contains weak background, weak industrial electrical interference, a relatively weak post-FBs reflection area (red rectangle), a relatively weak residual surface wave (blue rectangle), near-offset direct waves and far-offset refracted waves. The pixel-wise classification of the shot gather with 21 consecutive missing traces is similar to that of the original gather without missing traces, and the picked FBs are also consistent, even at the locations of missing traces.

We do not impose any constraints from common-sense factors related to the spatio-temporal relationship of FBs, such as small time delays between adjacent traces and increasing arrival time with offset, on learning the SegNet, but the picked FBs visually satisfy these factors, as illustrated in our examples. This suggests that the proposed well-trained 25-layer SegNet model has the ability to learn these common-sense factors from various seismic gathers with the corresponding labels. In turn, whether some common-sense factors or some experience of interpreters can be integrated as a constraint or a guide into the process of network learning, and what their roles are in classifying waveforms and picking FBs, is also worth studying.

Fig. 15. Comparisons between the 64 features (a-b) in the 17th convolution layer automatically extracted by applying the new well-trained 25-layer SegNet to the two given gathers of Fig. 14c and d, as well as the histogram of the absolute difference between b and a. Although there are differences between the corresponding features near the missing traces, these corresponding features appear similar. Furthermore, there are obvious boundaries between the shallow background part and the deep signal-dominant part in each extracted feature, even in the presence of 21 consecutive missing traces.

Fig. 16. Comparisons between the 2 features (a-b) in the 18th (final) convolution layer automatically extracted by applying the new well-trained 25-layer SegNet to the two given gathers of Fig. 14c and d, as well as the histogram of the absolute difference between b and a. There are relatively weak differences between the corresponding features near the missing traces. In addition, there are clear boundaries between the shallow background part and the deep signal-dominant part in all extracted features.

We derive a single-channel probability map by taking the maximum of the binary probabilities at each position in the output two-channel normalized probability map. For any unseen input shot gather, the resulting single-channel probability curve at each trace, including the locations of missing traces, is an impulse-like function with low-value probabilities of a certain width near the FBs. The locations of the minimum values within the low-value probability zone can be interpreted as FBs, which correspond to the locations of the segmentation lines interpreted by directly using the output two-channel normalized probability map. The single-channel probability map has the properties that continuous small-value probabilities always appear near FBs, and that the minimum probability values are related to the locations of FBs. It can probably be inferred that the smaller the minimum probability value in the derived single-channel probability map, and the wider the range of low-probability values, the more uncertain the first-break picking. This may be adopted, or further quantified, as an additional quality-control factor to evaluate the performance of the already-trained network model from a statistical point of view, and in turn to optimize the network architecture. Furthermore, the continuous low-value probabilities in the derived single-channel probability map indicate at least a time range for picking FBs. This may provide a new way to combine the SegNet-based first-break picker with a certain traditional picker.

5.Conclusions

We present a SegNet-based waveform classification and first-break picking method. The method can pick FBs directly from shot gathers with sparsely distributed or missing traces. At the cost of learning various shot gathers with randomly missing traces, our well-trained fully-convolutional SegNet model can automatically extract spatio-temporal features and classify seismic waveforms at the same time. Owing to the learning of samples with missing traces and the usage of several stacked pooling layers and convolution layers, the well-trained 25-layer SegNet model can automatically pick FBs of various unseen shot gathers acquired with different geometries, even when the proportion of randomly missing traces reaches 50%, 21 traces are missing consecutively, or traces are missing regularly. As our examples show, even when the proportion of randomly or regularly missing traces per gather in the testing set reaches 50%, the classification accuracy of the 2273 gathers is almost always over 97%, and the average error between the SegNet-based automatically picked FBs and the manually picked ones is about two time samples. Although we do not impose any constraints from common-sense factors related to the spatio-temporal relationship of FBs on learning the SegNet, FBs, including those at the locations of missing traces, can be picked with good lateral continuity. Our work can be readily extended to simultaneous multi-shot or multi-domain learning and first-break picking, to 3D amplitude learning and first-break picking by adding a frequency or scale dimension, as well as to simultaneous multiple-attribute learning and first-break picking, which are also our future work.

Acknowledgements

This work was financially supported by the National Key R&D Program of China (2018YFA0702504), the Fundamental Research Funds for the Central Universities (2462019QNXZ03), the National Natural Science Foundation of China (42174152 and 41974140), and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX 2020-03).