Ionospheric vertical total electron content prediction model in low-latitude regions based on long short-term memory neural network

2022-08-31

Chinese Physics B, 2022, Issue 8

Tong-Bao Zhang (张同宝), Hui-Jian Liang (梁慧剑), Shi-Guang Wang (王时光), and Chen-Guang Ouyang (欧阳晨光)

1 Department of Precision Instrument, Tsinghua University, Beijing 100084, China

2 State Key Laboratory of Precision Measurement Technology and Instrument, Tsinghua University, Beijing 100084, China

3 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Keywords: long short-term memory neural network, equatorial ionosphere, vertical total electron content (vTEC)

1. Introduction

Ionospheric phase delay is one of the main sources of noise affecting global navigation satellite system (GNSS) signals.[1–4] For instance, when demonstrating carrier-phase two-way satellite time and frequency transfer (TWCP) over a very long baseline, the ionosphere may introduce a time delay of the order of 100 ps.[5] In communication and navigation systems, single-frequency users may experience positioning errors ranging from meters to ten meters. Beyond GNSS, the operation of radio detection and ranging systems[6–9] and very-long-baseline interferometry (VLBI)[10] also requires ionospheric information to reduce the impact of the ionosphere.

Modeling the ionospheric total electron content is one of the most important and common methods of obtaining information on ionospheric activity. On the one hand, through empirical physics models (such as the Klobuchar, International Reference Ionosphere 2016, and NeQuick models) and global/regional total electron content maps (such as the Global Ionosphere Map of the Center for Orbit Determination in Europe, GIM CODE),[11] one can calculate and compensate for ionospheric delay in real time. On the other hand, users can obtain future ionospheric information through short- and long-term prediction models (such as machine learning and deep learning models) to understand and mitigate the effects of drastic space weather.

At mid-to-high latitudes, the establishment of prediction models is relatively simple because ionospheric changes are stable. At low latitudes, however, ionospheric activity is highly irregular, leading to extreme differences between the prediction models and the actual state of the ionosphere. Thus, new prediction models suitable for low-latitude regions are required. With the continuous development of deep learning algorithms, more attention is being paid to the use of neural network (NN) models to learn the past characteristics of the ionosphere under different space conditions and predict its future changes. More specifically, if an ionosphere map can be regarded as a generalized Markov process, meaning that at any given time $t$ the ionospheric state at time $t+1$ depends only on its state at or immediately before $t$, recurrent neural networks (RNN) can be used to predict its states.[12] However, when predicting data based on long-term past characteristics, RNN is limited by the difficulty of vanishing/exploding gradients as the number of layers increases.
Long short-term memory (LSTM) NN can effectively overcome these limitations of RNN and, in recent years, many researchers have turned to LSTM and related models to predict ionospheric parameters, especially vertical total electron content (vTEC). Sun et al. (2017) applied a bi-LSTM model to predict, at any given time, the TEC change expected in the following 24-hour period in Beijing; the root mean square error (RMSE) of their method was better than those of the multi-input-single-output LSTM, LSTM, and MLP models.[13] Gruet et al. (2018) used an LSTM model to obtain single-point predictions of the disturbance storm time index (Dst) on both stormy and normal days and improved the prediction accuracy of their model with a Gaussian process.[14] Srivani et al. (2019) compared the prediction results of the LSTM, NN, and International Reference Ionosphere 2016 (IRI-2016) models and showed that LSTM outperforms the IRI-2016 and NN models at low latitude (Bengaluru IGS station, India).[15] Sentürk E et al. (2020) compared autoregressive integrated moving average (ARIMA) and LSTM models under different space conditions (including quiet space weather and geomagnetic storms), showing that LSTM predicted better at low and middle latitudes,[16] while Tang et al. (2020) studied and compared the prediction results of the ARIMA, LSTM, and Seq2Seq models, demonstrating that LSTM had better prediction results during geomagnetic storms.[17] Ruwali et al. (2020) merged the LSTM setup with a convolutional neural network (CNN) and determined that the combined LSTM-CNN model was better at forecasting ionospheric delays for GPS signals than were the NN and gated recurrent unit models.[18] Liu et al. (2020) developed a deep learning model that utilized LSTM to forecast, at any given point in time, the spherical harmonics (SH) coefficients for the following two hours.
The predicted TEC map was in good agreement with CODE TEC products and yielded a smaller RMSE than did the IRI-2016 and NeQuick-2 models.[19] Rao et al. (2021) predicted the peak frequencies of the F2 layer (foF2) and the peak heights of the F2 layer (hmF2) using the bi-LSTM model, whose performance was demonstrated to be better than those of the LSTM, NN, and IRI-2016 models.[20] Wen et al. (2021) compared the LSTM, IRI-2016, and back propagation models at moderate latitudes (BJFS IGS station, China), where they showed the LSTM model to be better than the other two. Furthermore, they studied the TEC predictions of different stations spread across locations from 80°S to 80°N and found the accuracy of ionosphere models at low latitudes to be generally low.[21] Xiong et al. (2021) utilized an encoder-decoder LSTM extended (ED-LSTME) neural network with a sliding window to predict the TEC of 15 stations in China. Their system had an RMSE of 12.09 TECU and could be used under diverse conditions.[22] Salar et al. (2021) applied LSTM to predict upcoming storms based on historic data and on the ionospheric TEC,[23] while Moon et al. (2020) conducted a study using the LSTM model to perform 24-hour predictions. They trained the model on long-term data from the Jeju ionosonde (Jeju station, Korea) to predict data for the following 24 hours, making better predictions of foF2 and hmF2 values than existing models.[24] Kim et al. (2020) predicted 24-hour values of hmF2 and NmF2 by adopting LSTM in combination with the SAMI2 model, with an accuracy approximately 45% higher than those of the original SAMI2 and IRI-2016 models.[25] In addition, Kim et al. (2021) optimized the LSTM model based on the work of Moon et al. (2020) to make 24-hour predictions whose performance during geomagnetic storms, in terms of the RMSE of foF2 (hmF2) values, was better than those of the quiet LSTM, SAMI2, and IRI-2016 models.[26] These studies all showed that LSTM NN performs better than other methods, especially at low latitudes.

However, there remains a lack of LSTM models for the prediction of relatively long-term (at least 24 hours in advance) vTEC values at low latitudes. According to Ruwali et al. (2020), for data predicted more than 5 hours ahead at low latitudes, the RMSE of LSTM models increases rapidly and exceeds 4.8 TECU,[18] implying that long-term predictions of ionospheric TEC at low latitudes based on the normal LSTM model alone are unreliable. To overcome the declining accuracy of LSTM models at low latitudes and enable prediction over longer periods, we implemented a multi-input-multi-output LSTM (multi-LSTM) model and trained it with different time-series data. We used this model to predict vTEC levels at least 24 hours ahead over the low-latitude Peking University Shenzhen Graduate School station (geographic latitude 22.6°N, longitude 113.97°E), and employed a correction coefficient to modify the model. We also discuss the RMSE values of the 24- and 48-hour predictions and, after the modification, compare the prediction results of the multi-LSTM 24-hour model, the modified multi-LSTM 24-hour model, and the IRI-2016 model.

This article is organized as follows: the introduction, a description of the model architecture and dataset, a discussion of the prediction results and model modification, and a conclusion.

2. Model architecture and dataset

As an improvement on RNN, LSTM adds a gate mechanism to its framework, which enables it to retain information from historical data and overcome the difficulty posed by vanishing/exploding gradients. A general LSTM framework is shown in Fig. 1. It primarily contains a series of connected neuron units. Memory units, the forget gate vector $f_t$, the input gate vector $i_t$, and the output gate vector $o_t$ exist in each neuron unit and cooperate with point-wise operations (multiplication and addition) and activation functions (usually sigmoid and tanh) to update the cell state ($C_t$) and output ($h_t$) of the current unit.

Fig.1. Structure of LSTM neural network.

In the model calculation, all three gates of a given neuron unit $t$ are calculated from the input $x_t$ and the previous output $h_{t-1}$ through the sigmoid activation function. The forget gate $f_t$ determines which features of the cell state $C_{t-1}$ are used to calculate the state $C_t$. Each vector element lies in the range [0, 1]: 0 indicates that a specific feature is to be forgotten, while 1 means that it is to be retained. The input gate $i_t$ determines how much of the input is retained in the current cell state, to avoid retaining useless values, while the output gate $o_t$ determines how much of the cell state affects the output $h_t$. All three gates are calculated in the same way. The output $h_t$ and state $C_t$ are transmitted to the next neuron unit (denoted $t+1$), while $h_t$ is simultaneously used to generate the prediction of step $t$. The formulas for the cell state $C_t$ and output $h_t$ are as follows:
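The displayed formulas are not reproduced in this copy. As a reference point only, the standard LSTM update that matches the gate description above can be written as below; the weight matrices $W$, biases $b$, and the candidate state $\tilde{C}_t$ are conventional notation assumed here, not symbols taken from the original.

```latex
\begin{align}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right), &
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right), &
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right), \\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right), \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \\
h_t &= o_t \odot \tanh(C_t).
\end{align}
```

Here $\odot$ denotes point-wise multiplication, so $f_t$ scales what is kept from $C_{t-1}$, $i_t$ scales what is written from the new input, and $o_t$ scales what is exposed as $h_t$.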

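Taking 1 hour as the time resolution of the LSTM model, we define $t_{k,n}$ ($k=0,1,2,\ldots,23$ and $n=1,2,\ldots$) as the UTC time at $k$ o'clock of day $n$, $\mathrm{vTEC}_{t_{k,n}}$ as the detected value of actual vTEC at $t_{k,n}$, and $\mathrm{vTEC}'_{t_{k,n}}$ as the predicted vTEC value at $t_{k,n}$. Obtaining $\mathrm{vTEC}'_{t_{k+1,n}}$ depends on the detected values of historical vTEC, which form the time series referred to below as array (3).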

In other words, this normal LSTM model predicts ionospheric behavior at $t_{k+1,n}$ by using the long-term characteristics (such as diurnal, seasonal, solar cycle, and storm characteristics) of array (3). Diurnal and seasonal characteristics have periodicities of one day and one year, owing to the Earth's rotation on its axis and its revolution around the Sun, while one solar cycle lasts about eleven years and also introduces quasiperiodic changes to the ionosphere. When the data set is of limited size, eliminating the diurnal characteristic is the most effective way to preserve the inner streams and local perturbations required by the model. According to this hypothesis, we can transform one process into multiple independent random processes that, at the very least, satisfy the generalized Markov property, and represent array (3) as matrix (4) as follows:
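The displayed array and matrix are missing from this copy. A reconstruction consistent with the definitions above, assuming detected data are available for days $1$ to $n$ and that each matrix row collects one UTC hour across days, would read:

```latex
% Array (3): the full hourly history in time order (assumed form)
\left[\, \mathrm{vTEC}_{t_{0,1}},\ \mathrm{vTEC}_{t_{1,1}},\ \ldots,\ \mathrm{vTEC}_{t_{23,1}},\
         \mathrm{vTEC}_{t_{0,2}},\ \ldots,\ \mathrm{vTEC}_{t_{k,n}} \,\right]

% Matrix (4): one row per UTC hour k = 0..23, one column per day (assumed form)
\begin{pmatrix}
\mathrm{vTEC}_{t_{0,1}}  & \mathrm{vTEC}_{t_{0,2}}  & \cdots & \mathrm{vTEC}_{t_{0,n}}  \\
\mathrm{vTEC}_{t_{1,1}}  & \mathrm{vTEC}_{t_{1,2}}  & \cdots & \mathrm{vTEC}_{t_{1,n}}  \\
\vdots                   & \vdots                   & \ddots & \vdots                   \\
\mathrm{vTEC}_{t_{23,1}} & \mathrm{vTEC}_{t_{23,2}} & \cdots & \mathrm{vTEC}_{t_{23,n}}
\end{pmatrix}
```

Under this reading, row $k$ is a day-to-day series at a fixed UTC hour, so the diurnal cycle no longer appears within any single row.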

Each row vector of matrix (4) serves as input data for training the multi-LSTM model, which means that we can obtain at least 24 predicted values, from UTC 00:00 to 23:00 of the following day, with a time resolution of one hour. Figure 2 shows the difference between this multi-LSTM model and an LSTM model with normal time series input.
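The rearrangement of the hourly series into 24 per-hour rows can be illustrated with a minimal sketch. The function name `to_hourly_rows` and the synthetic data are hypothetical and serve only to show the indexing; this is not the authors' code.

```python
from typing import List, Sequence


def to_hourly_rows(series: Sequence[float]) -> List[List[float]]:
    """Rearrange a flat hourly series (day 1 hour 0, day 1 hour 1, ...)
    into 24 rows, one per UTC hour, each running over consecutive days."""
    if len(series) % 24 != 0:
        raise ValueError("series must cover whole days (a multiple of 24 values)")
    n_days = len(series) // 24
    return [[series[day * 24 + hour] for day in range(n_days)]
            for hour in range(24)]


# Three synthetic days: value = 100 * day + hour, so each entry is easy to trace.
flat = [100 * day + hour for day in range(1, 4) for hour in range(24)]
rows = to_hourly_rows(flat)
print(rows[0])   # values at UTC 00:00 on days 1..3 -> [100, 200, 300]
print(rows[23])  # values at UTC 23:00 on days 1..3 -> [123, 223, 323]
```

Each of the 24 rows would then be fed to the model as its own time series, giving one prediction per UTC hour of the following day.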

Our experimental data were obtained from the Peking University Shenzhen Graduate School station (geographic latitude 22.6°N, longitude 113.97°E, from https://data2.meridianproject.ac.cn/), Shenzhen, China. Every half hour, this station outputs a vTEC file containing vTEC data from up to 11 visible satellites, with a time resolution of 1 min. By choosing the median of the observations from all the satellites as the actual vTEC value, performing an exponential moving average every 30 min to weaken the influence of ionospheric scintillation, and extracting one value each hour from UTC 00:00 to 23:00, we obtained smoother ionospheric vTEC data with a time resolution of 1 hour.
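The three preprocessing steps (median across satellites, exponential moving average, hourly extraction) can be sketched as follows. The helper names, the smoothing factor `alpha = 2 / (span + 1)` with `span = 30`, and the synthetic readings are assumptions for illustration; the text does not specify the exact averaging parameters.

```python
from statistics import median
from typing import List, Sequence


def median_across_satellites(per_minute: Sequence[Sequence[float]]) -> List[float]:
    """per_minute[i] holds the vTEC readings of all visible satellites at minute i."""
    return [median(readings) for readings in per_minute]


def exponential_moving_average(values: Sequence[float], span: int = 30) -> List[float]:
    """EMA with smoothing factor alpha = 2 / (span + 1); span = 30 min is assumed."""
    alpha = 2.0 / (span + 1)
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


def hourly_samples(values: Sequence[float], minutes_per_hour: int = 60) -> List[float]:
    """Keep one value per hour from a 1-min series."""
    return list(values[::minutes_per_hour])


# Two hours of synthetic data: three satellites, one of them an outlier.
per_minute = [[10.0, 11.0, 50.0] for _ in range(120)]
med = median_across_satellites(per_minute)   # the outlier 50.0 is rejected
hourly = hourly_samples(exponential_moving_average(med))
print(len(hourly), hourly)
```

Taking the median rather than the mean keeps a single anomalous satellite path from shifting the station value.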

Figure 3 depicts the fluctuation of ionospheric vTEC from 2012 to 2020 (nearly one solar cycle). Since this article was written in 2021, the vTEC data through the end of 2020 are included in Fig. 3. Owing to variations in the zero point selected, the absolute detected vTEC values of this station were higher than those calculated by other methods; however, the relative changes are consistent among them.

Ionospheric activity was relatively low throughout 2019 and 2020 (Fig. 3(a)), indicating that these two years were in the descending phase of the solar cycle. Without considering modification and the sliding window, we chose vTEC data from May 1, 2019, to June 30, 2020, as the initial training set and the data of July 2020 as the testing set (Fig. 3(b)); that is, the multi-LSTM model was trained within the descending phase of the ongoing solar cycle.

Fig.2. Difference between multi-LSTM model and LSTM model with normal time series input.

Fig. 3. Fluctuation of ionospheric vTEC from 2012 to 2020. (a) Fluctuation from 2012 to 2020. (b) Fluctuation from 2019 to 2020 (the brown line represents the initial training set, from May 1, 2019, to June 30, 2020; the green line represents the testing set, from July 1, 2020, to July 31, 2020).

Fig.4. Structure of multi-LSTM model with sliding window.

In addition, a sliding window with a stride of one or two time steps was added to the multi-LSTM model. In the financial industry, a sliding window is often used to assess the stability and accuracy of a specific model. Through a sliding window, prediction models can capture changing characteristics in a historical time series and further evaluate the adequacy of models in terms of statistical properties and forecast error.[27] In our study, the sliding window is used in an actual forecasting model and is not limited to testing a statistical model. Figure 4 shows the structure of the multi-LSTM model with a sliding window. In the future, we will compare prediction accuracies while changing the length of the sliding window from short-term (for example, one year) to long-term (for example, one solar cycle) values; in this article, however, we have fixed the length of the sliding window to that of the initial training set (from May 1, 2019 to June 30, 2020). In the forecasting process, each time the window slides, the multi-LSTM model outputs the predicted results for the next 24 and 48 hours.
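The sliding-window scheme can be sketched as a generator that yields a fixed-length block of training days together with the day to be predicted. The name `sliding_windows` and the toy day labels are hypothetical; the sketch covers only the windowing, not the model training itself.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(days: Sequence[str], window: int, stride: int = 1
                    ) -> Iterator[Tuple[List[str], str]]:
    """Yield (training_days, target_day) pairs.

    The training block has a fixed length `window`; the target is the day
    immediately after the block. The block then advances by `stride` days.
    """
    start = 0
    while start + window < len(days):
        yield list(days[start:start + window]), days[start + window]
        start += stride


days = ["d1", "d2", "d3", "d4", "d5", "d6"]
pairs = list(sliding_windows(days, window=3))
for train, target in pairs:
    print(train, "->", target)
```

With a stride of one day, every forecast is made from the most recent fixed-length history, so the training set drops its oldest day each time it gains a new one.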

3. Results and discussion

3.1. Prediction results of multi-LSTM model

Figure 5 shows the predictions made 24 and 48 hours ahead for all days in July 2020. The red and blue lines represent the 24- and 48-hour results, respectively. Both predictions generally follow the trend of the detected data. Figure 6 shows the RMSE of the differences between detected and predicted data for each day in July 2020, from which it is evident that the 24-hour prediction performs better than the 48-hour prediction. Excluding July 12, 13, and 14, the RMSE values of the 48-hour predictions exceed 7 TECU on four days (July 1, 10, 14, and 18), whereas those of the 24-hour predictions do so on only one day (July 1). Moreover, the 48-hour RMSE is lower than the 24-hour RMSE on only 11 days; among the other 20 days, all 17 comparable days have 48-hour RMSE values greater than the 24-hour values. Furthermore, the mean RMSE of the 24-hour predictions (5.1 TECU) is slightly lower than that of the 48-hour predictions (5.3 TECU). These findings are consistent with those of Ruwali et al. (2020).[18]
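For reference, the daily RMSE used in this comparison can be computed as in the sketch below. Skipping hours where either value is missing (`None`) is an assumption made here so that a day with partial data can still be scored from the hours that exist; the function name and sample values are illustrative.

```python
import math
from typing import Optional, Sequence


def rmse(detected: Sequence[Optional[float]],
         predicted: Sequence[Optional[float]]) -> float:
    """Root mean square error over the hours where both values are present."""
    pairs = [(d, p) for d, p in zip(detected, predicted)
             if d is not None and p is not None]
    if not pairs:
        raise ValueError("no overlapping hours to compare")
    return math.sqrt(sum((d - p) ** 2 for d, p in pairs) / len(pairs))


# Every hourly error is +/-3 TECU, so the RMSE is 3.
full_day = rmse([10.0, 12.0, 14.0, 16.0], [13.0, 9.0, 17.0, 13.0])
# Only the first hour has both a detected and a predicted value.
partial_day = rmse([10.0, None, 14.0], [13.0, 9.0, None])
print(full_day, partial_day)
```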

Fig. 5. Predictions for 24 (red line) and 48 (blue line) hours ahead and detected data (black line) in July 2020: (a) from July 1 2020 to July 5 2020; (b) from July 6 2020 to July 10 2020; (c) from July 10 2020 to July 15 2020; (d) from July 16 2020 to July 20 2020; (e) from July 21 2020 to July 25 2020; (f) from July 26 2020 to July 31 2020.

Fig. 6. RMSE values of the differences between detected and predicted values (multi-LSTM model only). The red solid line represents the 24-hour results, the blue solid line the 48-hour results, the red dotted line the mean of the 24-hour results (5.1 TECU), and the blue dotted line the mean of the 48-hour results (5.3 TECU).

Fig. 7. The impact of data loss on model prediction accuracy, May 31, 2020.

In Fig. 5(c), gaps exist in the detected and predicted data from July 12 to 14. This situation arose because, on July 12, the dual-frequency receiver did not output complete ionospheric data (gap in the black line in Fig. 5(c)). Therefore, neither the 24- nor the 48-hour predictions on July 12 are comparable (in Fig. 6, the 48-hour RMSE of July 12 is calculated from the first 9 hours rather than the whole day). In addition, there are no 24-hour results on July 13, for lack of data from the previous day (gap in the red line on July 13 in Fig. 5(c)), and no 48-hour results on July 14, for lack of sufficient data from the two previous days (gap in the blue line on July 14 in Fig. 5(c)). During model training, missing ionospheric data within a few days are ignored, as we assume that, from a long-term perspective, for the data set at each time node of the multi-LSTM model, a gap of a few days corresponds to only a few points and does not affect the judgment of the long-term trend of the ionospheric vTEC. Taking May 31 as an example, Fig. 7 shows the differences between the 24-hour predicted values and the detected values of May 31, 2020 when different numbers of days are missing before May 30, together with the calculated RMSE. When about 20 days of data are missing, the error of the predicted results increases significantly during daytime, and the RMSE reaches 5.31 TECU. When the missing data span no more than 10 days, the model output does not introduce large errors. In the test set, the model output prediction results normally from July 15 onward.

3.2. Modified multi-LSTM model

In the test set, the data output by the model cannot have corresponding detected values for comparison and error correction like the training set. We can, however, find common factors that affect model accuracy through a retrospective analysis of historical data.

By defining the data from June 2020 as the validation set and using the multi-LSTM model to predict 24-hour data for the same month, we obtained the RMSE values shown in Fig. 8. The detected data from June 7, 9, 10, and 20 deviated more noticeably from the predicted values than did data from other dates, with RMSE values exceeding 7 TECU.

Fig. 8. RMSE values of the differences between detected and predicted results, June 2020. June 7, 9, 10, and 20 deviated more noticeably from the predicted values; all of their RMSE values exceed 7 TECU.

The detected vTEC values on June 7, 9, 10, and 20, the values from the preceding dates, and the corresponding predictions are shown separately in Fig. 9. To understand more clearly the causes of the vTEC fluctuations on these days, the corresponding F10.7, Dst, and Kp×10 indexes are shown in Fig. 10. In Figs. 10(d)–10(g), the Dst and Kp×10 indexes are combined to make them easier to observe.

According to Fig. 10(a), the F10.7 index is relatively stable for the whole month, indicating that solar extreme ultraviolet radiation has a limited impact on the ionosphere. In addition, according to Figs. 10(b) and 10(c), the overall analysis of the Kp and Dst indexes shows that no geomagnetic storms occurred that month, and only a few geomagnetic field disturbances occurred.

Fig. 9. Detected and predicted results (24 hours ahead) of June 7, 9, 10, 20 and the corresponding previous dates: (a) detected data of June 6 and June 7 (black line) and predicted data of June 7 (red line); (b) detected data of June 8 and June 9 (black line) and predicted data of June 9 (red line); (c) detected data of June 9 and June 10 (black line) and predicted data of June 10 (red line); (d) detected data of June 19 and June 20 (black line) and predicted data of June 20 (red line).

Fig. 10. F10.7, Dst, and Kp×10 indexes in June 2020. (a) F10.7 index in June 2020; (b) Dst in June 2020; (c) Kp×10 index in June 2020; (d) Dst and Kp×10 index from June 6 to 7; (e) Dst and Kp×10 index from June 8 to 9; (f) Dst and Kp×10 index from June 9 to 10; (g) Dst and Kp×10 index from June 19 to 20.

Fig. 11. The 24 histograms of the row vectors of activity matrix (6) from May 1, 2019 to May 31, 2020. (a)–(x) represent UTC 00:00 to 23:00; all 24 processes converge to a Gaussian distribution (red solid lines). Red dotted lines represent the boundary of the stable state: −4.22 TECU is the lower bound and 4.22 TECU is the upper bound.

Fig. 12. Predicted results 24 hours ahead for July 2020. The red line represents the results of the multi-LSTM model, the blue line the results of the modified multi-LSTM model, the green line the results of the IRI-2016 model, and the black line the detected data. (a) From July 1 2020 to July 5 2020; (b) from July 6 2020 to July 10 2020; (c) from July 10 2020 to July 15 2020; (d) from July 16 2020 to July 20 2020; (e) from July 21 2020 to July 25 2020; (f) from July 26 2020 to July 31 2020.

Among the four selected periods with large errors, on June 7, from UTC 7:00 to 23:00, the Dst first increased to 26 nT and then jumped to −7 nT (between the red dotted lines in Fig. 10(b)), and, from June 6 to 7, the Kp index showed an increasing trend, reaching 3.7, the highest value of the month (between the red dotted lines in Fig. 10(c)). The geomagnetic disturbance is at a relatively high level on June 7, as Fig. 10(d) shows clearly. Corresponding to the magnetic disturbance, in Fig. 9(a), the daytime peak of vTEC on June 7 is 10 to 15 TECU higher than that on June 6, and the lowest value at night is about 10 to 15 TECU lower than that on June 6.

For about 10 hours from UTC 19:00 on June 19 to UTC 5:00 on June 20, the Dst first dropped from 5 nT to −5 nT, quickly rose to 19 nT, and then dropped to −7 nT (between the green dotted lines in Fig. 10(b)). The Kp index fluctuates with Dst but remains below 3 (between the green dotted lines in Fig. 10(c)). The Kp index and Dst are combined in Fig. 10(g) to show the geomagnetic disturbance clearly. Corresponding to the magnetic disturbance, in Fig. 9(d), from UTC 19:00 to 24:00 on June 19, vTEC is generally lower than the period of UTC 19:00 to 24:00 by about 9 to 11 TECU and, from UTC 00:00 to 10:00 on June 20, vTEC differs by about 6 to 10 TECU from the stable period of the magnetic field at the beginning of June 19.

According to Figs. 10(e) and 10(f), and the intervals between the black dotted lines in Figs. 10(b) and 10(d), from June 8 to 10 the Kp index does not reach a particularly high value (it stays below 3), and the Dst is more stable. Although there is a jump from 1 nT to −17 nT on June 10, its amplitude and duration are smaller than those on June 7 and June 19 to 20; correspondingly, the ionosphere is disturbed by the geomagnetic field less than on June 7 and June 19 to 20. In Figs. 9(b) and 9(c), the vTEC fluctuation from June 8 to 9 peaks at about 8.7 TECU, while that from June 9 to 10 peaks at about 7.7 TECU, lasting only 3 hours at the longest.

Based on our analysis and related research,[28] we assume that large and long-lasting magnetic fluctuations introduce large and long-lasting fluctuations into the ionospheric vTEC, as on June 7 and from June 19 to 20, which cannot be estimated in advance. However, for most of June 9, 10, and 20, the ionospheric fluctuations introduced by the disturbance are small and short and, compared with the same periods on June 8, 9, and 19, the model severely exaggerated the ionospheric fluctuations, resulting in many large errors. Calculating the RMSE of the differences between the detected values of June 9, 10, and 20 and those of their respective previous days, we obtain 3.54, 3.18, and 2.81 TECU, all of which are less than the RMSE of the differences between the predicted values of June 9, 10, and 20 and the detected values of their previous days (7.03, 7.78, and 5.34 TECU). This also supports the conclusion that the model exaggerates the fluctuation of the ionospheric vTEC.

We define the fluctuation (or difference) at the same time on two adjacent days as ionospheric activity. To determine how universal this exaggeration is, and to minimize it accordingly, we choose an interval of ionospheric activity and apply a unified evaluation and modification to the predicted values outside it, where the interval boundary is selected according to the standard deviation of the detected activity values. We define the predicted data falling within the interval as stable. To this end, we define an activity matrix whose entries are the differences between the detected vTEC values at the same UTC hour on adjacent days.

We extract the 24 random processes represented by the row vectors of activity matrix (5) from May 1, 2019 to May 31, 2020, and plot their histograms as shown in Fig. 11.

All 24 random processes converge to a Gaussian distribution, and in particular their expectations converge to nearly 0. From UTC 00:00 to 23:00, the expectations are 0.0022, −0.0323, 0.0147, 0.0375, 0.0335, 0.0421, 0.0389, 0.0293, −0.0187, 0.0029, −0.0104, 0.0055, 0.0054, 0.026, 0.0041, −0.0041, −0.3432, −0.0225, −0.0196, 0.0383, 0.0355, 0.0087, −0.0015, and −0.0042 TECU, and the standard deviations are 5.09, 4.78, 5.44, 6.56, 7.35, 8.38, 8.96, 9.44, 9.31, 9.81, 8.53, 7.54, 6.30, 5.38, 4.88, 4.89, 4.62, 4.64, 4.59, 4.04, 3.88, 4.08, 4.40, and 5.04 TECU. At low latitudes, the daytime ionosphere is affected by X-rays, ultraviolet radiation, the equatorial anomaly, and other factors, which makes the standard deviation of its activity abnormally large and may hide some predicted values that need to be corrected. Therefore, to bring more values into consideration, we exclude daytime and select 4.22 TECU, the mean of the standard deviation values before sunrise in Fig. 11, as the boundary of the stable interval of activity (shown in Fig. 11, where −4.22 TECU is the lower bound and 4.22 TECU is the upper bound). This boundary changes with each sliding window of the training set and is updated every month.
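The construction of the activity rows and of the stable-interval boundary can be sketched as follows. The function names, the use of the population standard deviation (`pstdev`), and the choice of which hours count as "before sunrise" are assumptions for illustration; the text does not state them, so the sketch uses toy rows rather than attempting to reproduce the 4.22 TECU value.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def activity_rows(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Day-to-day difference at a fixed UTC hour: row[n] - row[n - 1]."""
    return [[row[n] - row[n - 1] for n in range(1, len(row))] for row in rows]


def stable_bound(activity: Sequence[Sequence[float]], hours: Sequence[int]) -> float:
    """Mean of the per-hour standard deviations over the selected hours."""
    return mean([pstdev(activity[h]) for h in hours])


def is_stable(value: float, bound: float) -> bool:
    """An activity value is 'stable' if it lies inside [-bound, bound]."""
    return abs(value) <= bound


# Two toy per-hour rows (five days each) instead of the full 24.
rows = [[10, 12, 10, 12, 10], [20, 26, 20, 26, 20]]
activity = activity_rows(rows)        # [[2, -2, 2, -2], [6, -6, 6, -6]]
bound = stable_bound(activity, hours=[0, 1])
print(activity, bound)
```

In the setting described above, `hours` would be restricted to the night-time rows so that the large daytime spread does not widen the interval.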

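We likewise define a predicted activity matrix, whose entries are the differences between the predicted vTEC values and the detected values at the same UTC hour on the previous day.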

A sliding window with a length of 30 days and a stride of one time step is defined, where the first sliding window is June 2020. Then, all values in the sliding window whose predicted activity lies outside the interval [−4.22, 4.22] are identified, giving a total of 276 results. To modify the model for the general situation, the mean absolute errors (MAEs) of the 276 predicted results (defined as MAEpredicted, whose initial value is 8.75 TECU) and of the 276 actual results (defined as MAEdetected, whose initial value is 4.95 TECU) are calculated. The comparison between MAEpredicted and MAEdetected shows that the exaggeration is common in the first sliding window; when predicted values with large errors are concentrated on particular days, such as June 9, 10, and 20, the prediction results deviate strongly from the actual situation.

where $\mathrm{vTEC}''_{t_{k,n}}$ is the final predicted result. $\mu$ is not a constant; it changes with each sliding-window step.
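The definition of the correction coefficient $\mu$ is not reproduced in this copy. Purely as an illustration, one plausible reading is that $\mu$ is the ratio MAEdetected / MAEpredicted over the out-of-interval values, applied to shrink the predicted day-to-day change; with the initial values quoted above this ratio would be 4.95 / 8.75 ≈ 0.57. The sketch below implements that assumed reading, and every function name in it is hypothetical.

```python
from typing import Sequence


def mean_abs(values: Sequence[float]) -> float:
    return sum(abs(v) for v in values) / len(values)


def correction_coefficient(pred_activity: Sequence[float],
                           det_activity: Sequence[float],
                           bound: float) -> float:
    """ASSUMED form: MAE(detected) / MAE(predicted), restricted to entries
    whose predicted activity lies outside [-bound, bound]."""
    selected = [(p, d) for p, d in zip(pred_activity, det_activity)
                if abs(p) > bound]
    if not selected:
        return 1.0  # nothing to correct in this window
    return (mean_abs([d for _, d in selected])
            / mean_abs([p for p, _ in selected]))


def modify(prev_detected: float, predicted: float, mu: float, bound: float) -> float:
    """ASSUMED form: scale the predicted change from the previous day's
    detected value by mu, but only when that change is outside the interval."""
    change = predicted - prev_detected
    if abs(change) <= bound:
        return predicted
    return prev_detected + mu * change


mu = correction_coefficient([8.0, -10.0, 2.0], [4.0, -5.0, 1.0], bound=4.22)
print(mu)                               # ratio over the two out-of-interval entries
print(modify(20.0, 30.0, mu, 4.22))     # large predicted jump is damped
print(modify(20.0, 22.0, mu, 4.22))     # small predicted change is left alone
```

Recomputing `mu` for each window position would reproduce the stated behavior that the coefficient changes with every sliding-window step.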

Fig. 13. RMSE values of the predictions of different models with respect to the detected values (24 hours ahead). The red solid line represents the results of the multi-LSTM model, the blue solid line the results of the modified multi-LSTM model, and the green solid line the results of the IRI-2016 model. The red dotted line represents the mean of the multi-LSTM model results (5.1 TECU), the blue dotted line the mean of the modified multi-LSTM model results (4.4 TECU), and the green dotted line the mean of the IRI-2016 model results (5.9 TECU).

Figure 12 shows the 24-hour outputs of the multi-LSTM model and the modified multi-LSTM model, the calculation results of the IRI-2016 model (from https://ccmc.gsfc.nasa.gov/cgi-bin/modelweb/models/), and the detected results (July 2020). The red line represents the multi-LSTM model, the blue line the modified multi-LSTM model, and the green line the IRI-2016 model. Compared with the IRI-2016 model, the multi-LSTM and modified multi-LSTM models are more sensitive to irregular jitter; thus, their prediction results are closer to the actual situation. Figure 13 shows the RMSE values of the multi-LSTM, modified multi-LSTM, and IRI-2016 models with respect to the detected results. As can be seen, the IRI-2016 model is the least accurate, with a mean RMSE of about 5.9 TECU in July. In contrast, the multi-LSTM model achieves about 5.1 TECU and the modified multi-LSTM model approximately 4.4 TECU. Furthermore, our correction method significantly reduces the RMSE values of the multi-LSTM model on 19 days, improving the accuracy of the multi-LSTM model.

To further understand the effect of the model modification under different geomagnetic conditions, we show the F10.7, Dst, and Kp×10 indexes of July 2020 in Fig. 14. It can be seen from Fig. 14(a) that solar extreme ultraviolet radiation in July is still relatively stable, and that four geomagnetic disturbances occur.

Fig. 14. F10.7, Dst, and Kp×10 indexes in July 2020. (a) F10.7 index in July 2020; (b) Dst in July 2020; (c) Kp index in July 2020.

The first geomagnetic disturbance occurred from June 30 to July 1. According to Figs. 10(b) and 14(b), from the end of June 30 to the beginning of July 1, the Dst jumps by about 30 nT, resulting in vTEC fluctuations similar to those on June 7. As reflected in Fig. 13, the model modification decreased the RMSE by less than 1 TECU on July 1. The second geomagnetic disturbance occurred from July 3 to July 8 (Figs. 14(b) and 14(c), from DOY 185 to 190): the Dst jumped by about 20 nT, no large vTEC fluctuations were introduced, and the prediction error was also low, with no negative impact on the model modification. The third and fourth geomagnetic disturbances occurred from July 14 to 16 (Figs. 14(b) and 14(c), from DOY 196 to 198) and from July 24 to 28 (Figs. 14(b) and 14(c), from DOY 206 to 210), respectively. These two magnetic disturbances are larger than the previous ones, with the jump amplitudes of the Dst exceeding 60 nT and the Kp index reaching 3 to 4. On the corresponding days, although the model modification slightly increases the RMSE values (in Fig. 13, July 15, 16, 27, and 28), the increases are all within 0.5 TECU, which has a limited impact on the overall prediction accuracy.

In summary, the model modification is best suited to the periods, which make up most of the time, when the geomagnetic disturbance is low or the geomagnetic field is stable. During large geomagnetic disturbances, our modification method does not introduce large errors, which increases the practicality of the model.

4. Conclusion

The aim of this study was to develop a more accurate LSTM model for predicting long-term (more than 24 hours ahead) ionospheric vTEC in low-latitude regions. First, we changed the sequence of the input by dividing one day into 24 nodes with a 1-hour time resolution. This eliminated the diurnal characteristic, allowing the model to directly predict inner streams and local perturbations. Every time the multi-LSTM model is executed, predicted results for at least one day can be obtained. We obtained true 24-hour and 48-hour results, but the accuracy was limited by severe fluctuations in the ionosphere at low latitudes. Therefore, we modified the multi-LSTM model based on a defined validation set with a sliding window, quantified the ionospheric activity and divided it into active and stable states, and utilized a correction coefficient defined using the MAE to reduce the impact of misjudgments on results predicted to be in the active state. The modified multi-LSTM model achieved greater prediction accuracy: its average RMSE on the test set (July 2020) was lower than those of the multi-LSTM and IRI-2016 models, namely about 1.5 TECU lower than that of the IRI-2016 model and 0.7 TECU lower than that of the multi-LSTM model. Our future plan is to improve the modification method for this multi-LSTM model and to run the modified multi-LSTM model over several years, including both the descending and ascending phases of the solar cycle. Moreover, we intend to explore the impacts of sliding window length, seasonality, and extreme space weather on the predicted results and ultimately establish a complete ionospheric prediction model suitable for different space conditions.

Acknowledgments

We would like to thank the Data Center for Meridian Space Weather Monitoring Project (NSSC, CAS) for providing detected ionospheric vTEC data, and the Space Physics Data Facility of Goddard Space Flight Center (NASA) for providing calculated Dst data and calculated ionospheric vTEC data. We would also like to thank Zhengbo Wang, Jianwei Zhang, and Yuhang Li for feedback on our manuscript. Project supported by the National Key Research and Development Program of China (Grant No. 2016YFA0302101) and the Initiative Program of State Key Laboratory of Precision Measurement Technology and Instrument.