Adaptive fault diagnosis of sucker rod pump systems based on optimal perceptron and simulation data

2022-06-02XiaoXiaoLvHanXiangWangZhangXinYanXinLiuPengChengZhao

Petroleum Science 2022年2期

Xiao-Xiao Lv，Han-Xiang Wang，Zhang Xin，Yan-Xin Liu，Peng-Cheng Zhao

College of Mechanical and Electrical Engineering，China University of Petroleum，Qingdao，266580，China

Keywords:Sucker rod pump Dynamometer card Adaptive fault diagnosis Sucker rod dynamics Output metering

ABSTRACT A highly precise and timely diagnosis technology can help effectively monitor and adjust the sucker rod production system (SRPS) used in oil wells to ensure a safe and efficient production.The current diagnosis method is pattern recognition of a dynamometer card (DC) based on feature extraction and perceptron.The premise of this method is that the training and target data have the same distribution.However，the training data are collected from a field SRPS with different system parameters designed to adapt to production conditions，which may significantly affect the diagnostic accuracy.To address this issue，in this study，an improved model of the sucker rod string (SRS) is derived by adding faultparameter dimensions，with which DCs under 16 working conditions could be generated.Subsequently an adaptive diagnosis method is proposed by taking simulated DCs generated near the working point of the target SRPS as training data.Meanwhile，to further improve the accuracy of the proposed method，the DC features are improved by relative normalization and using additional features of the DC position to increase the distance between different types of samples.The parameters of the perceptron are optimized to promote its discriminability.Finally，the accuracy and real-time performance of the proposed adaptive diagnosis method are validated using field data.

1.Introduction

The sucker rod production system (SRPS) is the most widely used artificial lift method in the petroleum industry (Bahbahani et al.，2016;Wilamowski and Kaynak，2000).In production practice，because of the complex and harsh working environment，the SRPS may work under abnormal conditions，which may lead to reduced production or even equipment damage(Dave and Mustafa，2017).Therefore，automatically diagnosing the faults in the SRPS has attracted research attention.The dynamometer card (DC)，a close curve representing the load versus the displacement of the polished rod in one working cycle，is one of the firsthand dynamic data in oilfield production.The shape of the DC can effectively reflect the down-hole working conditions of the SRPS;thus，these crucial data are quite useful in diagnosing the SRPS (Reges et al.，2015;Zheng et al.，2020).

Currently，the fault diagnosis of the SRPS is done through pattern recognition of the DC based on a classifier(Li et al.，2013a;Zheng et al.，2019a)，the specific process of which is shown in Fig.1.The core of the diagnosis is the construction of the classifier.As shown in Fig.1，the construction process has three main parts:labeled DC set，feature extraction method of the DC，and perceptron.First，the raw data of a DC {(si，Fi)} are normalized，and its features{Tn}are extracted and combined with the label data of the fault type{FT}to form training data.The training data are then used for perceptron training to determine its parameters.If the perceptron is an artificial neural network(ANN)，its parameters are the input and output weights and thresholds([w，b]ihand[w，b]ho).The ANN with the determined parameters can be considered a generalized fitting function fwb(·)，i.e.，a classifier.When diagnosing，the features of the test DC (Tntest) are extracted and inputted to the generalized function to calculate the label data FTtest，and then，the fault type of the SRPS can be diagnosed by discrimination.

Fig.1.Process of current diagnosis method.

Given the importance of the SRPS fault diagnosis，several advanced methods have been used to address this issue.Xu et al.，(2007) directly adopted the data points of a DC as input to a selforganizing competitive network for classification.Wu et al.，(2011) used three layers of a wavelet packet to decompose DC into eight energy eigenvectors and used radial basis function(RBF)networks for fault classification.Li et al.，(2013b) extracted the curve moment (CM) of a typical polished rod DC as features and adopted the improved support vector machine(SVM)to classify the DC.Gao et al.(Gao et al.，2015)selected five appropriate features as DC features and used the extreme learning machine (ELM) to diagnose the down-hole working condition.Wu et al.，(2013)extracted seven invariant moments of the DC and used SVM for the fault diagnosis of the SRPS.Li et al.，(2015) presented an automatic clustering algorithm for the classification of DCs.Zhong and Zou (2016) mapped the curve of a DC onto a gray matrix for a gray scale analysis.Zheng and Gao(2017)adopted seven geometric features of the DC and valve working positions as features，and introduced a continuous hidden Markov model (HMM) for fault diagnosis.Li et al.，(2018a) used the Freeman chain codes (FCC) as DC features and adopted the online sequential ELM for pattern classification.Zhou et al.，(2019) extracted the Fourier descriptors(FDs) of a DC curve and applied improved RBF networks to construct a classification model.

The above studies focused on promoting the representation ability of the features and the classification ability of the diagnosis model to obtain better diagnosis results.However，the quality of the training data，which also significantly affect the diagnosis results，has been ignored.Currently，the training data are DCs collected from different SRPSs，because it is difficult for one oil well to experience all the working conditions.The system parameters of the SRPSs designed to adapt to various production conditions are different，which may directly lead to different distributions of the training and target data of the diagnosis model.Thus，the diagnostic accuracy is significantly affected.Only a few studies have addressed this problem.Zhang et al.(Zhang et al.2019) applied dictionary-based transfer subspace learning to construct a transform matrix，whereby training and target data could be transferred into a common low-dimensional subspace.This method still requires a large amount of labeled data from field DCs，which increases manpower and material resources.Zheng et al.，(2019b)proposed an advanced strategy to generate DCs under six typical working conditions.However，in industrial production，there are nearly 20 types of fault conditions.

Hence，a method to eliminate the difference in the distributions of the training and target data of the diagnosis model is worth studying.Moreover，once some faults occur，the SRPS must be stopped，because these faults may lead to no production and high energy consumption，or even damage the system equipment.While other faults only affect the production efficiency and will not damage the machinery，the index determining whether the SRPS under these conditions could continue to work is the production rate.Therefore，the output metering method of an oil well under fault conditions is also useful.

Based on the above analysis，an approach to obtain the DC of a polished rod and operating process of the SRPS under fault conditions is proposed considering the influence of faults on the SRPS.This paper presents a strategy where the simulated DC set of the target SRPS，instead of the field DC set，is used as training data to solve the aforementioned distribution difference problem.Additionally，the measured DC is used to identify the improved SRS model parameters and predict the production rate of the SRPS.The contributions of this paper are summarized as follows.(I) An improved SRS model is derived by adding dimensions of the fault parameters，with which DCs under 16 working conditions could be generated.(II)An adaptive diagnosis method is proposed by taking simulated DCs generated near the working point of the target SRPS as training data.(III) A quantitative analysis method based on parameter identification of the improved SRS model is presented to predict the production rate of the SRPS under fault conditions.

2.Improved model of sucker rod production system

The operation of the SRPS is a complex process involving multivariable coupling，and it is difficult to calculate the load of a polished rod and output characteristics，particularly under fault conditions.Hence，based on the influence mechanism of the fault on sucker rod dynamics and the pump valve，an improved model of the SRPS is established to generate DC and predict the liquid production rate.

2.1.Model of sucker rod production system under normal conditions

2.1.1.Motion equation of sucker rod element

The pumping characteristics of the SRPS mainly depend on the longitudinal vibration of the sucker rod.According to Hooke's law and equilibrium equation(Gibbs，1963)，the motion equation of the rod element is derived as follows:

where A，E and ρ represent the area，Young's modulus and density of sucker rod，respectively;u represents the displacement field of sucker rod;frfrepresents the viscous friction;frtrepresents the Coulomb friction of tubing;g represents acceleration of gravity;α represents the deviation angle;s represents the position of node;t represents the time.

The friction of the well fluid can be expressed as:

where crfrepresents actual viscous damping coefficient of well liquid;ξvrepresents the ratio of the velocity of the fluid column bottom to the velocity of the plunger;μ represents viscosity of well fluid;Atirepresents internal area of tubing;fdrepresents logic variable of traveling valve opening;Afrepresents the area of fluid column;Aprepresents area of plunger;μlrepresents the viscosity of liquid;μgrepresents the viscosity of gas;Hlrepresents the liquid holdup.

The Coulomb friction force between the tubing and the sucker rod(Lv et al.，2020a) is as follows:

where crtrepresents the friction coefficient between the sucker rod and the tubing;Nnand Nbrepresent the normal force and abnormal force between the sucker rod and the tubing，respectively;I represents the inertia moment of the sucker rod section;k and T represent the curvature and torsion of the well trajectory，respectively;θ represents the azimuthal angle;ρfrepresents the density of well fluid.

2.1.2.Surface boundary conditions

At the wellhead，the upper end of the SRS is connected to a polished rod;thus，the surface boundary condition of the differential equation (Eq.(1)) can be expressed as:

where SArepresents the displacement of polished rod.

In an oil field，beam pumping units are commonly used to drive the SRS and oil pump.Based on the characteristics of a four-link mechanism，the motion law of the polished rod driven by a beam pumping unit can be deduced (Xing and Dong，2015) as follows:

where，Larepresents length of fore arm;θ0represents initial angle between the back arm and the vertical direction;Lbrepresents length of back arm;J represents distance between crank moving end and beam rotation center;L represents the length of pitman;r1represents crank length of beam pumping unit;H and G represent the vertical distance and horizontal distance between crank rotation center and beam rotation center，respectively;φ0represents the initial angle between the crank and the vertical direction;ω1represents the rotation speed of crank.

Belt pumping units are also used in oil production for energy savings.In one cycle，a belt pumping unit has two movement states:a reversing motion near the dead points and a linear motion(Luan et al.，2011).Hence，we have:

where ω2represents the rotation speed active wheel;r2represents radius of active wheel;φ represents the rotation angle of active wheel.

2.1.3.Continuous conditions

To adapt to the production conditions，sucker rods with different diameters and materials are always used.At the junction of the sucker rod，both the displacement and load are equal.Thus，the continuous conditions are as follows:

where Lirepresents the depth at the changing section or material of sucker rod，Eiand Airepresent Young's modulus and area of ithstage sucker rod，respectively;pfrepresents the fluid pressure;c represents number of sucker rod segments.

2.1.4.Pump boundary conditions under normal condition

The lower end of the SRS is a plunger，and its displacement and load depend on the operating characteristics of the pump valve.The pump boundary conditions consist of plunger equilibrium equation，barrel equilibrium equation，and state equation of the fluid in the pump chamber.Assuming that the tubing expands evenly without vibration，the pump boundary conditions can be derived as follows:

where pdrepresents discharge pressure;p* represents pump pressure;Fbprepresents the force exerted by the pump barrel on plunger;Lprepresents pump setting depth;atrepresents the logical variable of tubing anchoring;ξtrepresents the deformation coefficient of tubing string;vbrepresents the velocity of barrel;Φ represents the state function of the fluid in the chamber，which reflects the relationship between the fluid volume and the pressure.However，in this study，Φ represents the relationship between the plunger displacement，velocity of the barrel，and pump pressure.

Based on the valve state，the state function Φ can be divided into the following three cases.Case I.When both the standing valve(SV)and the traveling valve (TV) are closed，the plunger is stationary relative to the pump barrel under normal conditions.Case II.When the SV is open and TV is closed，the pump pressure is equal to the suction pressure.Case III.When the SV is closed and the TV is open，the pump pressure is equal to the discharge pressure.Thus，we have:

where vp=∂(Lp，t)/∂t is velocity of plunger;fsrepresents logic variable of standing valve opening;Δpdrepresents the pressure drop of traveling valve;psrepresents submergence pressure;Δpsrepresents the pressure drop of standing valve.

2.1.5.Solution technique of model

(1) Initial condition

The initial moment is when the polished rod and plunger are at the bottom dead point.At this moment，both the SV and TV are closed.Hence，

where u0represents the displacement of SRS due to its own floating weight.

(2) Solving method

The finite difference method proposed by Schafer and Jennings(1988) is adopted to solve the SRPS model.The transition conditions of the valve state are required when a computer program solves the model automatically;these can be obtained on the basis of the specific opening and closing conditions of the valve.

2.2.Down-hole boundary conditions of fault

When a fault occurs in the SRPS，the operating characteristics of the pump valve and the interaction between the tubing and the SRS may affect the operation of the SRPS.Specifically，these faults will change the state equation of the fluid in the chamber (Φ) and the force of the tubing on the sucker rod (Fbp)，thereby affecting the motion of the plunger and the load acting on the polished rod.

Fig.2.Down-hole conditions of various faults.

In industrial practice，the common faults in the SRPS include standing valve leakage (SVL)，traveling valve leakage (TVL)，gas influence，insufficient liquid supply (ILS)，plunger moving out of barrel (POB)，tubing leakage，top pump bumping (TPB)，bottom pump bumping (BPB)，pump sticking，rod parting，and abnormal properties of well fluid.Moreover，the various down-hole working conditions and the corresponding fault parameters are shown in Fig.2，where Sprepresents the stroke of the plunger;sprepresents the displacement of the plunger;Lsrepresents the anti-impact distance;sgrepresents the equivalent height of the gas in the barrel;qt，qs，and qprepresent the leakage rates of TV，SV，and plunger，respectively;Sobrepresents the maximum distance that the plunger can move in the pump barrel;Strepresents the restricted distance of the TPB;Sbrepresents the restricted distance of the BPB;Hsrepresents the submergence depth，reflecting the ability to supply liquid;Ltlrepresents the leakage position of the tubing;Lrsand Srsrepresent the location and distance of the sticking point，respectively;Lrprepresents the length of the part rod.

2.2.1.Gas influence and insufficient liquid supply

As shown in Fig.2b，if the well fluid at the pump inlet contains free gas，the compression and expansion of the gas mainly affect the pump pressure in caseI of the valve states.When the gas content is high，neither valves can be opened;this condition is called gas lock.However，when the gas content of the discharged fluid is very low，and the submergence is lower，the volume of the gas expands rapidly during loading，and only a low amount of liquid enters the pump chamber.Therefore，almost all the gas volume needs to be compressed before discharge，and the pump pressure is almost constant，which is the case of the ILS.

To establish the down-hole boundary condition of the gas affected，the following assumptions are made:(1)Compression and expansion of the gas proceed isothermally and polytropically.(2)The gas-liquid mixture evenly enters and leaves the pump chamber.

The gas in the chamber changes the characteristics of the pump valve only in case I，and in this case，the pump chamber forms a closed space.Based on the gas state equation，the change in the pressure of the gas in the pump chamber can be expressed as:

where κ represents the polytropic exponent of gas.

As shown in Eq.(12)，sgis introduced as an additional variable;thus，an additional constraint equation is required to make the boundary condition complete.The constraint equation corresponding to a change in sgis as follows.

where R and Rdrepresent the volume ratio of gas to liquid at the inlet and outlet of the pump，respectively.

In addition，the initial value of sgis as follows:

The state equation of the well fluid containing the gas in case I(Φ1) can be obtained by combining Eqs.(12)-(15).

2.2.2.Leakage of pump valve

The valve cannot be completely closed due to wear of the seat or ball and dirt or sand in the cover.The excessive anti-impact distance and stroke length may cause the plunger to move out of the pump barrel.Moreover，the wear and corrosion of the plunger and pump barrel will cause the gap between them to increase.All these cases may lead to pump valve leakage.When the standing valve leakage is high，the state in which the TV cannot open is called the standing valve failure (SVF).When the traveling valve leakage is very high，the state in which the SV cannot open is called the traveling valve failure(TVF).Since the influence of plunger leakage is the same as that of TV，the plunger leakage due to wear and corrosion is superimposed on the leakage of the TV.Similarly，the leakage of the pump barrel is superimposed on that of the SV.

The instantaneous leakage of the pump under normal conditions can be obtained through a theoretical analysis of the concentric cylinder leakage(Wang et al.，2019).

where ξprepresents plunger leakage coefficient;Dprepresents the diameter of plunger;lprepresents the length of plunger;δ represents the clearance between plunger and pump barrel.

When the plunger moves out of the pump barrel，the clearance of the plunger leakage is equal to the clearance between the plunger and the tubing.

where Dbrepresents the inner diameter of pump barrel;Dtirepresents the inner diameter tubing.

To fully reflect the influence of pressure difference on the valve leakage rate，the leakage coefficient and leakage exponent are introduced to characterize the valve leakage(Lv et al.，2020b).

where ζsand esrepresent the leakage coefficient and exponent of standing valve，respectively;ζtand etrepresent the leakage coefficient and exponent of traveling valve，respectively;ξp0represents the plunger leakage coefficient of specified pump.

The state function Φ1of the well fluid in the chamber under leakage conditions can be obtained based on the fact that the space released by the plunger movement is equal to the volume of the leaked fluid.Thus，we have:

2.2.3.Pump bump and sand production

When the anti-impact distance is too small or if sand，mud，rubber，or other solid foreign bodies are deposited in the pump barrel，the plunger may bump against the pump barrel during processing.In the TPB state，the standing valve will close immediately，and the load of the SRS will increase dramatically.In comparison，in the BPB state，the traveling valve will close at once，and the load of the SRS will be sharply reduced.Except during the impact period，the working process of the SRPS is consistent with that under normal conditions.Assuming that the plunger is temporarily fixed with the pump barrel when bumping occurs，the state function of the fluid in the chamber Φ can be updated as follows.

When there is no pump bumping，the force Fbpis the Coulomb friction，which is known.However，when bumping occurs，Fbpis a variable.Consequently，an additional equation is required to form a complete constraint.However，when the plunger bumps against the barrel at the top，the pump pressure is equal to the discharge pressure，whereas at the bottom，the pump pressure is equal to the submergence pressure.Thus，we have，

where FCrepresents the Coulomb friction between plunger and pump barrel.

Under conditions of sand production，an additional random high-frequency load may be generated between the pump barrel and the plunger.Based on the random function，an additional impulse load is defined as follows.

where nrrepresents the number of random impulse loads and is a random integer;rd1，rd2and rd3represent random real numbers between 0 and 1;FrArepresents the maximum amplitude of pulse load;t0irepresents the initial time of pulse load selected randomly;Tsprepresent the logical variable of sand production occurrence.

2.2.4.Tubing leakage

When the tubing is damaged due to abrasion or corrosion，the fluid in the tubing will flow back into the casing，resulting in no production and low discharge pressure.

where Lv tl represents the vertical depth at the tubing leakage location.

2.2.5.Pump sticking

Pump sticking is the state in which the sundries in the pump barrel or tubing almost completely limit the movement of the lower part of the SRS.When the SRS moves to the sticking point[u(Lrs，t)-u0(Lrs)≤-Srs]，the SRS is stationary relative to the tubing at the sticking point，and the tubing is stretched and compressed by the varying tension of the SRS.In this working state，the pump does not work，so the pump boundary conditions of the SRPS model are changed as follows:

where vtsrepresents the velocity of the tubing at the sticking point.

2.2.6.Rod parting

The parting rod is a common type of fault due to mechanical wear，fatigue damage，and corrosion damage.Under this fault condition，only the broken sucker rod reciprocates in the well，without production.For the model of the SRPS，the spatial range of the SRS displacement field u(s，t) is reduced from [0，Lp] to [0，Lrp]，and the pump boundary condition is changed as follows:

2.3.Result of improved model of SRPS

Based on the improved model of the SRS established above，the working conditions of the SRPS can be characterized using the following fault parameter X.

where Tps，Trpand Ttlrepresent the logical variables of occurrence of pump sticking，rod parting and tubing leakage，respectively.

For a given SRPS，the working process and the output performance of the SRPS can be obtained.

where Y represents the system parameter of the SRPS，including system structure parameters (configurations of the SRS，tubing，pump，well trajectory，and surface equipment model) and production parameters (stroke of the polished rod，pumping speed，and submergence depth);Perepresents a set of variables that characterize the working process and the output performance of the SRPS.

The two crucial elements of Peare the polished rod load and liquid production rate，and the calculation formulae are as follows.

where LPRsrepresents simulated polished-rod load;Δs represents the length step;Qlrepresents liquid production rate;Blrepresents the compressibility coefficient of well liquid;N represents the pumping speed;Afbrepresents the area of bottom fluid column.

3.Adaptive diagnostic method of rod pump system

3.1.Construction of diagnosis models

Based on the existing perceptron，including the back propagation neural network (BPNN)，ELM，and SVM，diagnosis models of the SRPS are constructed.In addition，DCs under various working conditions generated using the improved SRPS model are adopted to replace the field DC set and train the diagnosis models.To further improve the diagnosis accuracy，the feature extraction method of the DC is modified，and the parameters of the perceptron are optimized as well.The details are as follows.

3.1.1.Feature extraction based on relative normalized data

The purpose of feature extraction is to reduce the number of features as far as possible on the premise of retaining the workingcondition information，so as to reduce the dimension of the training input and improve the training and recognition efficiency of the diagnosis model.Currently，many features of the DC have been proven to be effective，such as FD，FCC，and CM.However，before extracting the features，the data of the DC should be normalized to avoid affecting the calculation due to the different dimensions between the load and the displacement.

Supposing each DC composes a set of discrete points{(si，Fi)}，the direct normalization formulae are as follows:

Fig.3 shows the direct normalized DC under several typical fault conditions.As shown in Fig.3 a，the shape of the DCs under ILS and SVL conditions can fully reflect the fault type;thus，the data directly normalized can be accurately classified.However，the direct normalized DCs under the conditions of SVF，TVF，and rod parting are too similar to be identified，as shown in Fig.3b.Similarly，it is also difficult to distinguish between normal DC and DC under the tubing leakage condition.This is because some faults not only affect the shape of the DC but also change the load range of the polished rod.

Therefore，to retain the quantitative characteristics of the DC，relative normalization is adopted to process the original DC data.The relative normalization formulae are as follows:

Fig.3.Improvement in feature extraction method.

The following formula is used to ensure that the relative normalized data are in the range of [0，1].

Based on the proposed relative normalization method，the relative normalized data of the DC under several typical fault conditions are obtained，as shown in Fig.3.As shown in Fig.3a，under SVL and ILS conditions，the shape information of the DCs is retained.Nevertheless，under SVF，TVF，and rod parting conditions，the positions of the relative normalized DCs are evidently different，as shown in Fig.3b，though with more similar shapes.As shown in Fig.3c，the relative normalization can preserve the fault characteristics of the maximum load line reduction under the tubing leakage condition，based on which this fault can be distinguished from that under the normal condition.To enhance the quantitative information of the load(ordinate)，the following features are added on the basis of the relative normalized DC features{Tr n}，as shown in Fig.3d.

where lcrepresents the total length of DC curve;larepresents the length of DC curve withvalue greater than 0.5;lbrepresents the length of DC curve withvalue greater than 0.25.

The features of FD，CM，and FCC improved by the relative normalization and additional features are denoted by IFD，ICM，and IFCC，respectively.

3.1.2.Perceptron training and discrimination of working condition

(1) Training data

To eliminate the difference in the distributions between training data (DCs under various system parameters of the field SRPS) and target data(DC of the SRPS to be diagnosed)，the DCs under various working conditions generated near the working point of the target SRPS are used as training data.

The training input is a matrix (n × Ns) composed of a featureparameter vector (n × 1) of the DC training set，and the output is a matrix (m × Ns) composed of a fault-indication vector (m × 1)，where n is the number of features，m is the number of working conditions，and Nsis the number of training samples.The working conditions are numbered on the basis of the following sequence:normal conditions，gas influence，gas lock，ILS，SVL，SVF，TVL，TVF，TPB，BPB，pump sticking，rod parting，tubing leakage，POB，heavy oil effect，and sand production.

(2) Discrimination of working condition

For the BPNN and ELM，the working condition of the SRPS can be identified by analyzing the distance between the diagnostic faultindication vector and the standard fault-indication vector.

where FTSrepresents the standard fault matrix composed of faultindication vectors of m types of working conditions;Im×mrepresents the m-order identity matrix;FDrepresents the code number of the diagnostic working condition;FTjrepresents the diagnostic fault-indication vector of the jth test well;FTSirepresents the standard fault-indication vector of the No.I working condition，i.e.，the ith column vector of Im×m.

For the SVM，the working condition of the SRPS can be identified by comparing the values of m hyperplane functions.

where Gi(·) represents the hyperplane function with the data of class i as positive samples and other data as negative samples;Tnjrepresents the DC features.

The diagnostic results are verified using Eq.(39).

where η represents the logical variable of correctness of diagnosis results;FArepresents the code number of actual working condition.

Thus，the diagnostic accuracy Dais as follows:

where Ntrepresents number of test samples.

3.1.3.Parameter optimization of perceptron

To avoid overfitting，the structure of the network (BPNN and ELM)should not be too complex.Thus，the number of hidden layers of the network is set to 1，and the number of neurons in the hidden layer is set to 30.Based on the selection criteria of the loss function and the characteristics of the label data and problems addressed in this paper，the cross entropy is selected as the loss function of the BP algorithm(Dong et al.，2020).Furthermore，because the SVM is a binary classifier，m classifiers are required to construct a diagnostic model.

Noticeably，the current perceptron learning algorithm has shortcomings that affect the diagnosis accuracy.An inappropriate initial value of the weight matrix will make the result of the BP algorithm be trapped in the local optimal region.For the ELM，although the least-squares (LS) algorithm can obtain the optimal weights and biases of the output layer，the input layer weights and variable biases of the hidden layer are given randomly，bringing significant uncertainty in the ELM result.The sequential minimum optimization (SMO) algorithm (Huang et al.，2015) can find the optimal classification hyperplane of the SVM;however，the type and parameters of the kernel function have no corresponding selection criteria.Therefore，to improve the adaptability of these perceptrons to the diagnosis problem，optimization is necessary.

Clearly，the objective of optimization is to minimize the error between the final training output and the expected output.Based on the discriminant formula (Eq.(37))，the optimization objective functions of the BPNN and ELM are defined as follows:

where Nsrepresents the number of test samples，FTArepresents the actual fault-indication vector.

Moreover，the optimization objective function of the SVM is defined as follows.

where Cjrepresents the working condition category of the jth training sample.

In this study，the genetic algorithm (GA) is adopted for the optimization (Li et al.，2018b;Zhang et al.，2020;Gokul and Sowmya，2019)，and the optimization process is shown in Fig.4，where iprepresents the iteration times of the BP algorithm;Iprepresents the maximum number of iterations of the BP algorithm;igrepresents the iteration times of the GA;Igrepresents the maximum number of iterations of the GA;LFrepresents the loss function;εLrepresents the permissible loss function of the BP algorithm;εfitrepresents the permissible fitness.Moreover，the initial values of the optimization parameters of the BPNN，ELM，and SVM are the weights and biases，input layer weights and hidden layer biases，and kernel function parameters，respectively.

3.2.Quantification of SRPS performance

According to Eq.(27)，the SRPS performance depends on system and fault parameters，where the system parameters are known for a given SRPS.The optimization inversion method is applied to identify the fault parameters(Lv et al.，2020b).The fault parameter X that minimizes the difference between the simulated DC generated by the improved model of the SRS and the measured DC is taken to characterize the actual SRPS.

3.2.1.Optimization model for identifying fault parameters

(1) Optimization objective function

For the convenience of setting the unified allowable error，the difference between the measured DC and simulated DC is defined as follows:

where S represents the stroke length of polished rod.

(2) Constraint condition

The constraint condition is the limitation of the fault parameter range.Additionally，the feasible region XDcan be reduced to a specific fault parameter space XDsby the diagnosis results obtained under the SRPS working conditions.In summary，the optimization model can be expressed as follows:

3.2.2.Solution method of the optimization model

The optimization objective function is the result of a complex model rather than an analytical formula.Thus，if the optimization search is carried out directly，the high number of iterations will dramatically increase the optimization time.Generally，the influence of fault degree of a single fault on the error is monotonous.Therefore，a method to obtain the approximate analytic expression of the objective function is proposed，and the specific steps are shown in Fig.5.First，the feasible region is discretized roughly，and the objective function value of the discrete point is calculated.Subsequently，the analytic expression of the objective function instead of the simulation model is obtained by fitting the discrete point data.

Fig.4.Optimization process of perceptron parameter based on genetic algorithm.

Fig.5.Calculation process of quantification of SRPS performance.

3.3.Framework of adaptive diagnosis method

Based on the above analysis，an adaptive diagnosis framework is proposed，as shown in Fig.6.The method is divided into online and offline parts.The task of the offline part is to generate sufficient DCs under various working conditions based on the system parameters of the target SRPS using the improved SRPS model for training the diagnosis model.In contrast，the task of the online part is to diagnose the working condition of the SRPS based on the measured DC and to quantify the system performance using the improved SRPS model and inversion algorithm.

Fig.6.Framework of adaptive diagnosis method.

4.Results and discussions

4.1.Data collection of actual wells

To verify the effectiveness of the method proposed in this paper，1104 DCs under various working conditions were collected from Shengli Oilfield in China.Among them，1024 DCs were used as the training set compared with the simulation data.The other 80 DCs were used as test samples for verification.The system parameters of the SRPS corresponding to the part test DCs are shown in Table 1，where SR represents the steel sucker rods，and CR presents carbon fiber sucker rods.

Table 1System parameters and production rates of actual SRPSs

4.2.Simulation results of improved model of SRPS

Based on the system parameters of 1# SRPS，shown in Table 1，and improved model of the SRPS，the polished rod DCs under 16 working conditions are acquired，as shown in Fig.7，where Sperepresents the effective stroke of the plunger.

Table 2Typical faults and corresponding characteristics of DC

In Fig.7a，the curve of the normal DC consists of four parts:loading，suction，unloading and discharge.Moreover，the suction and discharge curves fluctuate near the maximum and minimum static load lines (Fsmaxand Fsminlines)，respectively.The overall shape of the normal DC is like a parallelogram，and there are fluctuations in the upper and lower sides due to the vibration of the SRS，consistent with the actual process.

When both the valves are closed，the compression and expansion of the gas in the pump chamber slow down the change in the pump pressure，particularly in the unloading section，because of the high initial gas volume.Therefore，the loading and unloading lines of the DC affected by the gas are elongated，particularly the unloading line.Moreover，according to Eq.(12)，the change in the pump pressure decreases during loading whereas it increases in the unloading section，resulting in a convex loading line and a concave unloading line.As a result，the lower right corner of the DC under conditions of gas influence is missing，as shown in Fig.7b.In addition，the DC of the ILS is similar to the DC affected by the gas，because the two faults have the same action mechanism on the SRPS.However，the gas content in the pump chamber is low under ILS conditions;thus，the loading process is unaffected，whereas the unloading process can be divided into two stages.In the first stage，the gas in the pump chamber is compressed，and the pump pressure remains unchanged.In the second stage，the gas compression is low，and the pump pressure decreases rapidly.Therefore，as shown in Fig.7c，the bottom-right part of the DC under the ILS condition is a polyline.Although the effective stroke of the plunger will be reduced，the components of the SRPS will not be damaged under the conditions of ILS and gas influence.Hence，if the output and energy consumption are within the acceptable range，the SRPS can continue to work.However，under a gas lock condition，the huge amount of gas in the pump chamber will significantly prolong the loading and unloading processes so that there are no suction and discharge sections in the entire working cycle.Therefore，the DC of the gas lock shown in Fig.7c appears like a curved bar.

Fig.7.Simulated DCs of polished rod under various fault conditions.

Under the valve leakage conditions，in the loading section，the fluid leaking into the chamber through the TV will slow down the decrease in the pump pressure，i.e.，reduce the loading speed of the polished rod.With the decrease in the pump pressure，the leakage rate of the TV increases，leading to a lower loading speed of the polished rod.In the unloading section，the TV leakage will accelerate the increase in the pump pressure，and the acceleration effect gradually decreases.Therefore，the loading line of the TVL DC shown in Fig.7g is elongated whereas the unloading line is shortened，and both the lines are convex.In contrast to the TVL，the loading line of the SVL DC，shown in Fig.7e，is shortened whereas the unloading line is elongated.Compared with the normal DC，the upper left part of the TVL DC is missing，whereas the lower right part of the SVL DC is missing.Additionally，valve leakage not only reduces the effective displacement of the plunger but also reduces the liquid discharge during the effective stroke according to Eq.(18).Therefore，the production rate of the SRPS will significantly decrease under valve leakage conditions.

If the TV is damaged，the pump pressure will be maintained at the discharge pressure;under this condition，the polished rod load will remain near Fsmin.Therefore，the DC is a horizontal bar near the Fsminline shown in Fig.7h.However，the failure of the SV will cause the pump pressure to remain at the submergence pressure.Hence，as shown in Fig.7f，the DC of the SVF is a horizontal bar near the Fsmaxline.If the sucker rod string breaks，the polished rod load will be a floating weight of the broken sucker rod.Thus，as shown in Fig.7l，the DC of the rod parting is a horizontal bar below the Fsminline.Moreover，when the fault of pump sticking occurs，the movement of the lower end of the SRS is completely limited.The polished rod will bear the load due to the large-strain tension of the SRS.Consequently，the DC of pump sticking is a straight bar whose slope is the stiffness of the SRS shown in Fig.7k.Under these four conditions，the SRPS is no longer in production，and the oil pump will be or is seriously damaged.Thus，the SRPS should be overhauled immediately when these faults occur.

Under the pump bumping conditions，the SRPS operates normally except during the bumping;thus，the DC is consistent with the normal DC.However，the plunger load will change sharply during the bumping.Specifically，if the plunger hits the upper part of the barrel，the load of the polished rod will increase rapidly near the upper dead point;this results in a bugle in the upper right part of the DC，as shown in Fig.7i.Similarly，the lower left part of the DC under BPB conditions will bulge，as shown in Fig.7j.Pump bumping will not only reduce the effective stroke of the plunger but alsodamage the plunger and pump barrel.Therefore，once these faults are detected，the SRPS should be stopped immediately and then readjusted.

Fig.8.Comparison of diagnostic accuracy.

Fig.9.Comparison of improved feature diagnosis results.

Fig.10.Change in the fitness in the optimization process.

As shown in Fig.7m，the DC of the tubing leakage is similar to that under the normal condition.However，because of the leakage of the fluid in the tubing above the leakage position，the pump outlet pressure，i.e.，the discharge pressure，is reduced.Therefore，the suction curve of the DC is lower than the Fsmaxline，and no fluid is lifted to the surface.Moreover，as shown in Fig.7n，when the plunger moves upward and comes out of the pump barrel，the leakage clearance of the plunger will increase suddenly，leading to a rapid decrease in the polished rod load.As a result，the upper right part of the DC is missing，and the effective stroke of the plunger is reduced.

Fig.11.Accuracy results of optimized diagnostic model.

The higher viscous resistance of the heavy oil to the SRS will attenuate the resonance amplitude of the SRS rapidly and increase the stroke loss，causing the DC to become round and plump，as shown in Fig.7o.The shape of the DC indicates that the power of the polished rod increases while the liquid production rate decreases when heavy oil is recovered，implying an increase in the energy consumption.In addition，any sand in the pump barrel will add additional random loads to the plunger.These loads will propagate to the polished rod，resulting in significant burrs on the DC curve，as shown in Fig.7p.

Fig.12.Accuracy comparison between adaptive diagnosis and current diagnosis methods.

The above analysis results show that the DC corresponding to each working condition has unique characteristics.Hence，the working conditions of the SRPS can be easily identified by experienced engineers based on the DC.Comparing the characteristics of the measured DCs under the various working conditions summarized in Table 2 (Zhou et al.，2019) and the characteristics of the simulated DCs analyzed above，a good agreement is found.The comparison results show that the established model of the SRPS can effectively simulate the operation process of the SRPS under fault conditions.In addition，the DCs generated by the improved model of the SRPS can replace the DCs collected from the oilfield as training data for the diagnosis model.

Table 3Calculation results of quantification

4.3.Diagnosis results and discussion

4.3.1.Effect of diagnosis model type

Based on the system parameters of the SRPS corresponding to the 80 test DCs，1024 DCs (16 working conditions，64 DCs under each working condition)for each SRPS were generated to train the diagnostic model.The accuracy of the diagnosis models composed of three perceptrons (BPNN，ELM，and SVM) combined with six features(FD，CM，FCC，IFD，ICM，and IFCC) is shown in Fig.8.

As shown in the three sub-graphs of Fig.8，with the increase in the number of samples，the accuracy of the diagnostic models increases but with a decreasing rate.Among them，the BPNN has the greatest increase，because the increase in the number of training samples can quickly improve the generalization ability of the BPNN.However，the increase in the SVM is the lowest，because its diagnostic accuracy is only related to a small number of key samples(support vectors).

The ELM and BPNN，with Fourier descriptors as features，have the highest accuracy，because the Fourier inversion results are continuous characteristics of the DC;moreover，they have superior fitting ability for continuous parameters.The SVM with the FCC as the feature has the highest diagnostic accuracy，because the FCC is a discrete and standardized feature (an eight-direction FCC has only eight integer-level values)，and the SVM has better classification ability for these types of data.

Moreover，with improved features，the accuracy of the diagnosis model constructed by all the perceptrons is improved.The average improvement in the diagnosis accuracy (1024 training samples)of the nine diagnostic models was 10.2%，of which the highest was 18.8%(SVM-FCC)and the lowest was 4.8%(BPNN-FCC).Eventually，with the improved features，the highest diagnostic accuracy of the BPNN，ELM，and SVM could reach 82.5%，85%，and 90%，respectively.

Taking the diagnostic model of the IFCC-SVM with the highest accuracy as an example，the specific diagnosis results of the 80 test samples are shown in Fig.9.With the original feature (FCC)，23 samples were diagnosed incorrectly.Four of them are under sand production (No.16) conditions，because the FCC is obtained from the contour of the discrete points on the DC，and it could not fully include the features of the high-frequency load.In addition，13 of them are under conditions of SVF (No.6)，TVF (No.8)，and rod parting (No.12)，because the DCs under these working conditions are horizontal bars with different positions，and a direct normalization eliminates the position information.Nevertheless，the improved features retain the quantitative information of the position，and the relative normalization weakens the influence of the pulse load on the discrete point contours of the DC.Therefore，the diagnosis accuracy of the samples under the conditions of SVF，TVF，and rod parting is 100%，and the number of diagnosis error samples under sand production conditions is reduced by 2.Referring to the samples of the diagnostic errors under the other conditions，the main reason is that the fault of the SRPS is minor and has less effect on the features of the DC.

Fig.13.Difference function of DC.

4.3.2.Effect of optimization

Taking 21# SRPS as an example，the optimization results of the three perceptrons with the IFD as input are shown in Fig.10.The fitness of the BPNN and ELM is reduced by nearly 10 times，and the fitness of the SVM is also reduced by 50%，which shows that the optimization can improve the recognition accuracy of the perceptrons.

Furthermore，the accuracy of the diagnosis model after parameter optimization for actual SRPSs is shown in Fig.11.

As shown in Fig.11，the accuracy of the optimized diagnostic models is increased by 3.7%-16.3%，with the highest increase obtained by the BPNN-IFCC and the lowest by the SVM-IFD.Moreover，the BPNN has the maximum average increase in the diagnostic accuracy(13.8%)，because the optimization could help the BPNN to obtain the global optimal solution，and the optimization of all the weights and biases of the neural network could improve the fitting ability of the BPNN to the greatest extent.The ELM comes second，because the error of the ELM is feed-forward，and the optimization results of the input weights and biases of the hidden layers are not necessarily the global optimal solution.The SVM could obtain the optimal plane and support vectors according to the KKT condition;only the parameters and relaxation factors of the kernel function can be optimized.Thus，the average increase in the SVM diagnostic accuracy is lowest (5%).More importantly，after optimization，the highest diagnostic conditions of the BPNN，ELM，and SVM are 96.3%(BPNN-IFD)，95% (ELM-IFD)，and 95%(SVM-IFCC)，respectively.

4.3.3.Accuracy comparison between current method and adaptive method

Fig.12 shows the diagnostic accuracy comparison between the current diagnosis method based on the DC set collected from the oilfield and the adaptive diagnosis method proposed in this paper based on the simulated DC data.Here，all the nine diagnostic models are composed of improved feature extraction methods and optimized perceptrons.

As shown in Fig.12，under the 16 working conditions mentioned above，the average accuracy of the adaptive diagnosis method is 15.6% higher than that of the current method.In addition，the highest accuracy of the current diagnosis method is 82.5%，which is 13.8%lower than that of the adaptive diagnosis method.The above results show that the simulated data of the DC can replace the DC collected from the field as the training data for the diagnosis model，and the adaptive diagnosis method is superior to the current method in terms of the diagnostic accuracy.

4.3.4.Quantitative results of SRPS performance

The faults，including the gas influence，ILS，SVL，and TVL，only slightly affect the production efficiency and will not damage the machinery.Therefore，the index to judge whether the SRPS could continue to work is whether the production rate is acceptable under these fault conditions.Thus，it is necessary to obtain the quantitative performance of the SRPS，particularly the liquid production rate，from the measured DC.

Taking 6# SRPS as an example，the difference function (f) between the simulated DC and the measured DC is shown in Fig.13.The working condition of the SRPS is gas influence，so the main fault parameter is R，and the corresponding element in the fault parameter(X)is Rd.As shown in Fig.13，with the increase in Rd，the value of the difference function first drops and then rises，and there is only one minimum point on the curve.The abscissa (Rdo) of the minimum point is the key fault parameter that characterizes the fault degree of the 6# SRPS.As shown in the sub-graph of Fig.13，when Rdis equal to Rdo，the simulated DC has a good consistency with the measured DC.

Based on the method presented in Section 3.2，the quantitative performances of 25 test SRPSs under normal or minor fault conditions(gas influence，ILS，SVL，and TVL)were obtained as shown in Table 3.Table 3 gives the average error of the liquid production rate(Ql) and the average calculation time under each working condition.

The highest error of the liquid production rate is 6.6%，and the average is 4.8%，which meets the requirements of industrial output metering for the SRPS.Meanwhile，it also proves the validity of the improved model of SRPS model.The experiment was performed on an industrial computer equipped with Intel Core i5-4460 and a 64-bit operating system.The code of the quantitative analysis method and these diagnosis models were run on MATLAB 2017b.The diagnostic procedures required less than 1 s to complete，whereas the computational time for the quantitative analysis was longer with the highest value of 90.8 s.Nevertheless，the computational time was no more than 10 times of the operation cycle.Hence，the method proposed in this paper could realize a timely highprecision diagnosis of the working condition and output metering for the SRPS.

5.Conclusions

To improve the diagnostic accuracy for faults，an adaptive diagnosis method for a rod pump system was developed.First，an improved model of the sucker rod dynamics was derived，with which DCs under 16 working conditions could be generated.Subsequently，taking a set of simulated fault DCs generated near the working point of the target SRPS as training data，a novel framework based on an improved feature extraction method and an optimized perceptron was constructed to diagnose the down-hole working conditions.Moreover，a quantitative method for the SRPS performance，particularly the liquid production rate，was presented on the basis of the parameter identification of the improved SRPS model.The accuracy of the adaptive diagnosis method and quantization method was demonstrated through field data collected from 80 oil wells.Compared with the current diagnosis methods，the accuracy of the adaptive diagnosis method could be improved by 13.8%，because the distributions of the simulated training data and target data were highly consistent.Finally，the real-time performance of the proposed method was verified through computational efficiency experiments.

Acknowledgments

This work was support by the Major Scientific and Technological Projects of CNPC under Grant no.ZD2019-184-004;the National Research Council of Science and Technology Major Project under Grant no.2016ZX05 042004;the Fundamental Research Funds for the Central University under Grant no.20CX02307A;and the Opening Fund of National Engineering Laboratory of Offshore Geophysical and Exploration Equipment under Grant no.20CX02307A.The authors also express their great gratitude to Petroleum Engineering Research Institute of Sinopec Shengli Oilfield Company for providing the data of field well.