
AR Model Based on Time Series Modeling for Predicting Egg Market Price in 2021

2021-08-02

Agricultural Biotechnology (English Edition), 2021, Issue 3

Min YAO, Qingmeng LONG, Di ZHOU, Jun LI, Ping LI, Ying SHI, Yan WANG

Abstract Eggs, a staple consumer food in China, are closely tied to the vegetable basket project. Exploring and predicting the future trend of the egg market price is therefore of great significance for stabilizing egg prices and market supply. In this study, a time series AR model was fitted to the egg market prices over the 66 d from January 1 to March 7, 2021. A white noise test with 18 lags (nlag=18) gave Pr>ChiSq<0.005, so the series was not a white noise series, and the stationary series was then used for modeling. The optimal model was selected as the AR series (BIC(3,0)), and the final egg market price model was Xt=9.055 6+(1+0.892 6)εt. The model showed that egg price fluctuations in 2021 will be clustered, and that later prices will be significantly affected by external factors in the previous period. The dynamic prediction results showed that the egg price would stop falling in March 2021 and would then continue to rise slowly during March.

Key words Time series; Autocorrelation coefficient; Partial correlation coefficient; AR model; Egg market price

Received: January 23, 2021  Accepted: March 29, 2021

Supported by Construction of Guizhou Breeding Livestock and Poultry Genetic Resources Testing Platform [QKZYD(2018) 4015]; Science and Technology Innovation Talent Team of Guizhou Province's Major Livestock and Poultry Genome Big Data Analysis and Application Research (QKHPTRC[2019]5615); Guizhou Provincial Poultry Industry Joint Research Project.

Min YAO (1986-), female, P. R. China, engineer, master, devoted to research about food engineering.

*Corresponding author. E-mail: 920522402@qq.com.

The report of the 19th National Congress of the Communist Party of China proposed the implementation of the strategy of rural revitalization. Subsequently, the Opinions of the Central Committee of the Communist Party of China and the State Council on Implementing the Strategy of Rural Revitalization was issued, and the Opinions of the Ministry of Agriculture and Rural Affairs on Vigorously Implementing the Strategy of Rural Revitalization and Accelerating the Promotion of Agricultural Transformation and Upgrading was also announced. All these indicate that agriculture should "adhere to the market orientation, focus on adjusting and optimizing the agricultural structure, achieve a new balance between agricultural supply and demand at a higher level, accelerate the construction of agricultural informatization, strengthen the application of agricultural big data, and build an agricultural and rural data resource system". As a large agricultural country, China has a wide variety of agricultural products and a large demand, and price prediction research will have immeasurable potential value.

The time series X(t), t=0,1,…, N-1 is the external manifestation of the evolution of a system, that is, an objective measurement of the system. In practice, most time series are non-stationary, and their changes are affected by many factors. Some factors play a long-term, decisive role and give the series a certain trend and regularity; others play a short-term, non-deterministic role and give the series a certain degree of irregularity. The basic idea of time series forecasting[1-3] is to use the past behavior of a phenomenon to predict its future: the historical data of the time series reveal the law by which the phenomenon changes over time, and this law is extended into the future to make predictions.

The time series forecasting problem can be regarded as a function regression problem. Because time series forecasting uses past and current observations to estimate future values, it rests on the assumption that a certain functional relationship exists between future values and past values, and the purpose of forecasting is to construct this function[4-5]. With the phase space reconstruction theory[6], a single-variable nonlinear time series is reconstructed in phase space so that the input pattern is directly related to the nonlinear system that generated the series. In this way, once the historical data of the time series are input, the future can be predicted by constructing the non-stationary time series AR model.

Introduction to stationary time series

For any t, the autocovariance function satisfies $\gamma(X_t, X_{t+h})=\gamma(X_0, X_h)$. A stationary series requires that the autocovariance be unaffected by time and depend only on the time difference h. Consequently, the autocorrelation function of a stationary series drops quickly to near 0 as the lag order increases; a series whose autocorrelation decays slowly may be non-stationary[7].
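The decay of the autocorrelation function described above can be checked numerically. The following is a minimal sketch, not the procedure used in this study: the function name `sample_acf` and the simulated series (a first-order autoregression with coefficient 0.5) are assumptions chosen only for illustration.

```python
import random


def sample_acf(x, max_lag):
    """Return sample autocorrelations r(0), ..., r(max_lag) of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n  # lag-0 autocovariance (the variance)
    acf = []
    for k in range(max_lag + 1):
        # The lag-k autocovariance uses only the time difference k
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


# Hypothetical stationary series: x_t = 0.5 * x_{t-1} + noise
rng = random.Random(1)
x = [0.0]
for _ in range(499):
    x.append(0.5 * x[-1] + rng.gauss(0.0, 1.0))

acf = sample_acf(x, 10)
print([round(r, 3) for r in acf])
```

For a stationary series such as this one, the printed coefficients should shrink toward 0 within a few lags, whereas a trending series would keep large coefficients over many lags.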

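An autocorrelation graph is a two-dimensional plot of vertical lines, one for each lag. The abscissa represents the delay order, and the ordinate represents the autocorrelation coefficient. The partial autocorrelation coefficient PACF(k) is given by:

$$\mathrm{PACF}(k)=\frac{E[(Z_t-\hat{E}Z_t)(Z_{t-k}-\hat{E}Z_{t-k})]}{\sqrt{E[(Z_t-\hat{E}Z_t)^2]\,E[(Z_{t-k}-\hat{E}Z_{t-k})^2]}}=\frac{\mathrm{cov}(Z_t-\hat{E}Z_t,\ Z_{t-k}-\hat{E}Z_{t-k})}{\sqrt{\mathrm{var}(Z_t-\hat{E}Z_t)\,\mathrm{var}(Z_{t-k}-\hat{E}Z_{t-k})}} \tag{1}$$

Here $\hat{E}Z_t$ and $\hat{E}Z_{t-k}$ denote the predictions of $Z_t$ and $Z_{t-k}$ from the intermediate variables $Z_{t-1},\dots,Z_{t-k+1}$.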

The abscissa of the partial autocorrelation graph represents the delay order, and the ordinate represents the partial autocorrelation coefficient.

When the lag-k autocorrelation coefficient ρ(k) is computed, the correlation obtained is not a pure correlation between x(t) and x(t-k). Because x(t) is also affected by the k-1 intermediate random variables x(t-1), x(t-2),…, x(t-k+1), and these k-1 variables are all correlated with x(t-k), the autocorrelation coefficient ρ(k) is mixed with the effects of other variables on x(t) and x(t-k). The partial autocorrelation coefficient is introduced to measure the effect of x(t-k) on x(t) alone. For a stationary time series {x(t)}, the lag-k partial autocorrelation coefficient is the correlation between x(t-k) and x(t) given the k-1 intermediate random variables x(t-1), x(t-2),…, x(t-k+1), in other words, after the interference of these intermediate variables has been eliminated.
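One common way to remove the influence of the intermediate variables is the Durbin-Levinson recursion, which converts autocorrelations into partial autocorrelations. The sketch below is an illustration under stated assumptions, not the computation used in this study: the function name `pacf_durbin_levinson` is hypothetical, and the input is the theoretical autocorrelation 0.6^k of a first-order autoregression.

```python
def pacf_durbin_levinson(rho, max_lag):
    """Partial autocorrelations for lags 1..max_lag from autocorrelations rho[0..max_lag]."""
    pacf = []
    prev = []  # coefficients of the order-(k-1) predictor: phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        # Part of rho[k] not already explained by the k-1 intermediate lags
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the predictor coefficients to order k
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical autocorrelations of a hypothetical AR(1) series with coefficient 0.6
rho = [0.6 ** k for k in range(6)]
pacf = pacf_durbin_levinson(rho, 5)
print([round(v, 6) for v in pacf])
```

Although the autocorrelations 0.6^k decay only gradually, the partial autocorrelations are 0.6 at lag 1 and essentially zero afterwards, because lags beyond the first add nothing once x(t-1) is accounted for.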

The p-order autoregressive process AR(p) is defined as:

$$X_n=a_0+a_1X_{n-1}+a_2X_{n-2}+a_3X_{n-3}+\dots+a_pX_{n-p}+\varepsilon_n \tag{2}$$

In the formula, $\varepsilon_n$ is a white noise series with $\mathrm{Var}(\varepsilon_n)=\delta^2$.
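To make equation (2) concrete, the sketch below simulates an AR(p) process directly from the recursion. It is an illustration only: the function name `simulate_ar`, the coefficients (a0=3.0, a1=0.5, a2=0.2), the noise level, and the burn-in length are all assumed values, not estimates from the egg price data.

```python
import random


def simulate_ar(a0, coeffs, n, noise_sd=1.0, burn_in=200, seed=0):
    """Simulate n values of X_n = a0 + a1*X_{n-1} + ... + ap*X_{n-p} + eps_n."""
    rng = random.Random(seed)
    p = len(coeffs)
    # Long-run mean of a stationary AR(p): a0 / (1 - a1 - ... - ap)
    mean = a0 / (1.0 - sum(coeffs))
    x = [mean] * p  # start at the long-run mean
    for _ in range(n + burn_in):
        eps = rng.gauss(0.0, noise_sd)  # white noise term
        nxt = a0 + sum(coeffs[i] * x[-1 - i] for i in range(p)) + eps
        x.append(nxt)
    return x[-n:]  # drop the starting values and the burn-in


# Hypothetical AR(2) with long-run mean 3.0 / (1 - 0.5 - 0.2) = 10
series = simulate_ar(a0=3.0, coeffs=[0.5, 0.2], n=2000)
sample_mean = sum(series) / len(series)
print(round(sample_mean, 3))
```

The sample mean should settle close to a0/(1-a1-…-ap), which is the level around which a stationary AR(p) series fluctuates.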

AR models are generally used for stationary series. Because an observed series is not necessarily stationary, a stationarity test must be performed before an AR model is fitted.

Time series modeling flowchart

Modeling

Data source

The data came from Zhengzhou Commodity Exchange and Dalian Commodity Exchange on the website of the Ministry of Agriculture and Rural Affairs of the People's Republic of China (http://zdscxx.moa.gov.cn). Specific data are shown in Table 1.


Stationary non-white noise test

Before time series analysis, it is necessary to judge whether the variable is stationary, which is a prerequisite for valid model estimation. If the variable is not stationary, the series is usually log-differenced to make it stationary. In this study, the ADF stationarity test was applied to the variable, egg price. The autocorrelation coefficient in Fig. 1 attenuates quickly to 0 at the second order, and the partial correlation coefficient in Fig. 2 attenuates quickly to 0 at the third order, so the series was stationary. In addition, the P value of the white noise test was <0.001, which shows that the egg price series was not a white noise series: its values are correlated with one another, so the series could be modeled.
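A white noise test of this kind is commonly carried out with the Ljung-Box Q statistic, which pools the first m squared autocorrelations and compares the result with a chi-square distribution with m degrees of freedom. The sketch below assumes that statistic and m=18 lags; the text does not name the statistic behind its white noise test. The 66-point series `prices` is a made-up declining sequence, not the data in Table 1, and the closed-form tail probability used here is valid only for an even number of degrees of freedom.

```python
import math


def sample_acf(x, max_lag):
    """Return sample autocorrelations r(0), ..., r(max_lag) of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
            for k in range(max_lag + 1)]


def ljung_box_q(x, m):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n-k)."""
    n = len(x)
    r = sample_acf(x, m)
    return n * (n + 2) * sum(r[k] ** 2 / (n - k) for k in range(1, m + 1))


def chi2_sf_even_df(q, df):
    """Upper-tail chi-square probability P(X > q) for a positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = q / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (q/2)^i / i!
        total += term
    return math.exp(-half) * total


# Hypothetical 66-day series with a slow downward drift (not the study's data)
prices = [10.0 - 0.02 * t + 0.05 * math.sin(t) for t in range(66)]
q_stat = ljung_box_q(prices, 18)
p_value = chi2_sf_even_df(q_stat, 18)
print(round(q_stat, 2), p_value)
```

A small p-value rejects the hypothesis that the series is white noise, which is the condition under which fitting a model to the series is worthwhile.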

Model recognition

The analysis above showed that the price of eggs in China in 2021 was a stationary non-white noise series, which is suitable for analysis with the ARIMA model. Therefore, an ARIMA analysis was first conducted on the egg price. Following the general-to-simple approach to ARIMA order selection, the model was reduced by removing the variable with the least significant coefficient each time, and the optimal model was then determined by comparing AIC and considering other statistics. The output for the best ARIMA model of the egg price in China is shown in Fig. 4. From AR(5) and BIC(3,0), p=3 and q=0, that is, the autocorrelation coefficient tails off at the third order (p=3) and the partial correlation coefficient is truncated at the first order (q=0), so the AR model was selected.
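The idea of choosing an order by an information criterion can be sketched as follows. This is a simplified illustration, not the software procedure that produced BIC(3,0) in this study: the function name `ar_bic_by_order` is hypothetical, the criterion is taken as n·ln(residual variance) + p·ln(n), the residual variance of each AR(p) comes from the Durbin-Levinson recursion, and the input is a simulated first-order series.

```python
import math
import random


def sample_autocov(x, max_lag):
    """Return sample autocovariances c(0), ..., c(max_lag) of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def ar_bic_by_order(x, max_p):
    """Return [BIC(0), ..., BIC(max_p)] for AR models fitted by Yule-Walker."""
    n = len(x)
    c = sample_autocov(x, max_p)
    rho = [ck / c[0] for ck in c]
    var = c[0]  # residual variance of the order-0 model
    bic = [n * math.log(var)]
    prev = []
    for k in range(1, max_p + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        var *= 1.0 - phi_kk ** 2  # each extra lag can only shrink the variance
        bic.append(n * math.log(var) + k * math.log(n))  # fit term plus penalty
    return bic


# Hypothetical series: x_t = 0.8 * x_{t-1} + noise
rng = random.Random(7)
x = [0.0]
for _ in range(499):
    x.append(0.8 * x[-1] + rng.gauss(0.0, 1.0))

bic = ar_bic_by_order(x, 5)
best_p = min(range(len(bic)), key=bic.__getitem__)
print(best_p, [round(b, 1) for b in bic])
```

The penalty term p·ln(n) is what stops the criterion from always preferring the largest order: an extra lag is kept only if it reduces the residual variance enough to pay for itself.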

Fitting model

$$X_t=\sigma+(1-Q_1B-Q_2B^2-\dots-Q_qB^q)\varepsilon_t=\sigma+\Phi(B)\varepsilon_t$$

where B is the delay operator. The related timing diagram and statistical tests showed that the residuals were white noise. The estimated value of σ was 9.055 661 and Φ(B) was (1+0.892 62), as shown in Fig. 5, so the model equation was:

$$X_t=9.055\,661+(1+0.892\,62)\varepsilon_t$$

The value of Pr>Chisq was less than 0.000 1, indicating that the model fitting was effective. There was no autocorrelation in the residual series, and the residual test was passed.

Significance test of the model

Fig. 6 shows that the P values of the two parameters σ and Φ(B) in the t test were both less than 0.000 1, so both parameters were significantly non-zero. The model was therefore already in its simplest form, and no insignificant parameters needed to be deleted.

Prediction of the egg market price trend in 60 consecutive days after March 7, 2021 using the model

Fig. 7 and Fig. 8 show the prediction data of the 60 d from the 67th to the 96th series following the original data of the 66 d, together with their 95% confidence intervals. In Fig. 7, the black line shows the known egg prices for days 1-66, the red line shows the predicted prices of the 60 d, and the green lines show the 95% confidence intervals of the forecast values. From Xt=9.055 661+(1+0.892 62)εt, it can be seen that egg price fluctuations in 2021 will be clustered, and that later prices will be significantly affected by external factors in the previous period. The dynamic prediction results of the model showed that the egg price would stop falling in March 2021 and would then continue to rise slowly in March and stabilize at 9.055 7 yuan/kg, which is consistent with the data trend predicted in Fig. 7.
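Why a forecast of this kind levels off at the constant term can be illustrated with a first-order autoregression around a mean. The sketch below is an assumption-laden illustration, not the fitted model: only the mean 9.055661 is taken from the text, while the function name `ar1_forecast`, the coefficient 0.9, the last observed price 8.6, and the noise standard deviation 0.1 are hypothetical.

```python
import math


def ar1_forecast(last_value, mean, phi, noise_sd, horizon, z=1.96):
    """Return (point, lower, upper) forecasts for steps 1..horizon of an AR(1)."""
    out = []
    var = 0.0
    for h in range(1, horizon + 1):
        # The gap between the last value and the mean shrinks by phi each step
        point = mean + (phi ** h) * (last_value - mean)
        # Forecast error variance accumulates: sd^2 * (1 + phi^2 + ... + phi^(2(h-1)))
        var += noise_sd ** 2 * phi ** (2 * (h - 1))
        half_width = z * math.sqrt(var)
        out.append((point, point - half_width, point + half_width))
    return out


mean = 9.055661  # constant term reported in the text
# phi, last_value and noise_sd are hypothetical values for illustration
fc = ar1_forecast(last_value=8.6, mean=mean, phi=0.9, noise_sd=0.1, horizon=60)
for h in (1, 10, 30, 60):
    point, lower, upper = fc[h - 1]
    print(h, round(point, 4), round(lower, 4), round(upper, 4))
```

Starting below the mean, the point forecasts rise at a decreasing rate and flatten near 9.0557, while the 95% interval widens and then levels off, which is the general shape described for Fig. 7.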

References

[1] LIU H, MI XW, LI YF. Smart multi-step deep learning model for wind speed forecasting based on variational mode decomposition, singular spectrum analysis, LSTM network and ELM[J]. Energy Conversion & Management, 2018, 159(1): 54-64.

[2] CUI GZ, ZHANG ZX, YANG LZ, et al. Improved wavelet threshold denoising algorithm[J]. Modern Electronics Technique, 2019, 42(19): 50-53. (in Chinese)

[3] WANG GH, YANG Y, WU HR, et al. Intelligent analysis of produces market price based on web[J]. Journal of Shenyang Agricultural University, 2013(3): 284-288. (in Chinese)

[4] ZHANG MG. The relevant analysis of price in agricultural product regulation: Cases of price of live pig and corn[J]. East China Economic Management, 2013(5): 173-176. (in Chinese)

[5] YAN Z. The forecast of agricultural commodity price trends based on association analysis model[J]. Journal of Zhejiang Business Technology Institute, 2019, 19(1): 7-11. (in Chinese)

[6] KAMILARIS A, KARTAKOULLIS A, PRENAFETA-BOLDÚ FX. A review on the practice of big data analysis in agriculture[J]. Computers & Electronics in Agriculture, 2017(143): 23-37.

[7] IP RHL, ANG LM, SENG KP, et al. Big data and machine learning for crop protection[J]. Computers & Electronics in Agriculture, 2018(151): 376-383.