
Physics-constrained neural network for solving discontinuous interface K-eigenvalue problem with application to reactor physics


Nuclear Science and Techniques, 2023, Issue 10

Qi-Hong Yang·Yu Yang·Yang-Tao Deng·Qiao-Lin He·He-Lin Gong·Shi-Quan Zhang

Abstract Machine learning-based modeling of reactor physics problems has attracted increasing interest in recent years. Despite some progress in one-dimensional problems, there is still a paucity of benchmark studies on problems that are easy to solve using traditional numerical methods yet remain challenging for neural networks over a wide range of practical configurations. We present two networks, namely the Generalized Inverse Power Method Neural Network (GIPMNN) and the Physics-Constrained GIPMNN (PC-GIPMNN), to solve K-eigenvalue problems in neutron diffusion theory. GIPMNN follows the main idea of the inverse power method and determines the lowest eigenvalue iteratively. PC-GIPMNN additionally enforces conservative interface conditions for the neutron flux. Meanwhile, the Deep Ritz Method (DRM) solves for the smallest eigenvalue directly by minimizing the eigenvalue in Rayleigh quotient form. A comprehensive study was conducted using GIPMNN, PC-GIPMNN, and DRM to solve problems of complex spatial geometry with varying material domains from the field of nuclear reactor physics. The methods were compared with the standard finite element method. The applicability and accuracy of the methods are reported and indicate that PC-GIPMNN outperforms GIPMNN and DRM.

Keywords Neural network·Reactor physics·Neutron diffusion equation·Eigenvalue problem·Inverse power method

1 Introduction

In the nuclear engineering domain, the fundamental mode solution of the K-eigenvalue problem based on steady-state multigroup neutron diffusion theory is crucial for the simulation and analysis of nuclear reactors. The eigenvalue equation can be expressed as follows:

$$-\nabla\cdot\left(D_g\nabla\phi_g\right)+\Sigma_{R,g}\phi_g-\sum_{g'\neq g}\Sigma_{g'\to g}\phi_{g'}=\frac{\chi_g}{k_{\text{eff}}}\sum_{g'=1}^{G}\nu\Sigma_{f,g'}\phi_{g'},\qquad g=1,\dots,G,\tag{1}$$

where $\phi_g\equiv\phi_g(r)$ denotes the neutron flux at spatial point $r$ in the $g$-th energy group, and $D_g$, $\Sigma_{R,g}$, $\Sigma_{g'\to g}$, $\chi_g$, and $\nu\Sigma_{f,g'}$ denote spatially dependent (possibly discontinuous) parameters that reflect the material properties in a reactor core [1]. Nuclear engineers and analysts must numerically determine the fundamental mode eigenvalue (commonly called $k_{\text{eff}}$) and the corresponding eigenvector for a given geometry/material configuration. For a numerical solution, Eq. (1) must be discretized and reduced to a set of $G$ coupled algebraic equations, which can be expressed using matrices as follows:

$$A\boldsymbol{\phi}=\frac{1}{k_{\text{eff}}}B\boldsymbol{\phi}.\tag{2}$$

This is often termed a generalized eigenvalue problem because coefficient matrices occur on both sides of the equation. Many mature numerical methods, such as the finite difference method [2], nodal collocation method [3], finite element method [4-6], and nodal expansion method [7-9], have been proposed to solve neutron diffusion equations. Among these methods, the nodal expansion method is widely used because it is easier to implement and requires less computational effort than the others. The power iteration method is the most well-known numerical method for solving for the principal K-eigenvalue [10]. A detailed review of conventional numerical methods for solving K-eigenvalue problems is available in recent nuclear reactor physics textbooks [1, 11, 12]. Although traditional and mature numerical methods are currently widely used in nuclear reactor physics to solve K-eigenvalue problems with acceptable engineering accuracy and computational cost, state-of-the-art neural networks for solving K-eigenvalue problems are still in their infancy. However, neural networks have the potential to provide another option for solving nuclear engineering problems as new algorithms and hardware progress.

As mentioned above, many traditional numerical methods have been proposed to solve neutron diffusion equations. However, a dense mesh is required to ensure highly accurate results: the model consumes more computational resources if the mesh is extremely dense, whereas inaccurate solutions are obtained if the mesh is coarse. Moreover, for high-dimensional problems, classical methods are either inefficient or unsuccessful because of the curse of dimensionality. A neural network can potentially enhance performance by using a capable hypothesis space due to its relatively low statistical error [13].

Neural networks exhibit multiple potential benefits in solving nuclear engineering problems when compared with conventional numerical methods.

• They provide mesh-free solutions to approximate physics fields in nuclear reactors, in which mesh generation is significantly complex due to the high heterogeneity of geometry and materials.

• They provide a general framework to solve high-dimensional problems governed by parameterized PDEs, particularly the original neutron transport equation, which comprises seven variables: three spatial variables, two directional variables, one energy variable, and one temporal variable.

• They can seamlessly incorporate prior data (potentially with noise) into existing algorithms, given that data assimilation is otherwise necessary as a post-process for conventional numerical methods [14-19].

• They provide a general framework to solve inverse problems with hidden physics; for conventional numerical methods, such problems are typically prohibitively expensive and require different formulations and elaborate computer codes.

In recent years, neural networks have been widely used to solve Partial Differential Equations (PDEs) [20] and have achieved remarkable success.

Based on neural networks, a large number of methods have been proposed to solve PDEs, such as the deep backward stochastic differential equation (BSDE) method [21, 22], deep Galerkin method (DGM) [23], deep Ritz method (DRM) [24], and Physics-Informed Neural Network (PINN) [25]. The deep BSDE method reformulates PDEs using backward stochastic differential equations and approximates the gradient of an unknown solution using neural networks. Although DGM and PINN appear under two names in the literature, they are similar: both train neural networks by minimizing the mean squared error loss of the residual equation. In contrast, DRM reformulates the original problem into a variational problem and trains neural networks by minimizing the energy functional of the variational problem. DRM and PINN [25] in particular have attracted widespread attention. Extensive studies have applied these methods to a variety of problems, including fiber optics [26], hyperelasticity [27], solid mechanics [28], heat transfer problems [29-31], inverse problems [32-35], and uncertainty quantification [36-39]. However, only a few studies have focused on solving eigenvalue problems [24, 40-45].

The need to solve eigenvalue problems with neural networks can be traced back to 2018 [24]. A deep Ritz method for solving variational problems was proposed, and several examples elucidated how to use DRM to solve eigenvalue problems [24]. First, the original eigenvalue problem was transformed into a variational problem. Then, a specially defined loss function, termed the Rayleigh quotient, was constructed using the variational principle [46]. The Rayleigh quotient is a well-known approximation of the eigenvalue of a matrix $A$ and is defined by $R(A,x)=\frac{x^{\top}Ax}{x^{\top}x}$. Finally, they minimized the loss function and obtained the smallest eigenvalue. Similarly, some studies [41, 42] directly used PINN to solve eigenvalue problems. In contrast to DRM, which transfers the original problem to a variational problem, PINN solves eigenvalue problems without variation. Neural networks are used in PINN to represent the function, and automatic differentiation (AD) [47] is used to acquire the vector affected by the differential operators. The loss function is obtained using the Rayleigh quotient, and the smallest eigenvalue and corresponding eigenvector are then found using optimization tools. Moreover, the deep forward-backward stochastic differential equation (FBSDE) method [48], an extension of the deep BSDE method, was proposed to solve eigenvalue problems; there, the eigenvalue problem is reformulated as a fixed-point problem of the semigroup flow induced by the operator. Other studies [43, 44] proposed an alternative method that learns the eigenvalue problem by adding one or two regularization terms to the loss function. More recently, in Ref. [49], a neural network framework based on the power method [50] was presented to solve eigenvalue problems and smallest-eigenvalue problems, in which the eigenfunction is expressed by the neural network and solved iteratively following the idea of the power method or inverse power method. However, the scope was limited to linear operators and certain special eigenvalue problems.

In a recent study, PINN was applied to solve neutron diffusion equations [45, 51]; the authors used a free learnable parameter to approximate the eigenvalue and a novel regularization technique to exclude null solutions from the PINN framework. A conservative physics-informed neural network (cPINN) was proposed in discrete domains for nonlinear conservation laws [52]. Moreover, cPINN [53] was applied to solve heterogeneous neutron diffusion problems in one-dimensional cases; it develops a PINN for each subdomain and considers additional conservation laws along the interfaces of the subdomains (a general consideration in reactor physics [11]), with the eigenvalue involved in neural network training as a variable to be optimized.

More recently, a data-enabled physics-informed neural network (DEPINN) [40] was proposed to solve neutron diffusion eigenvalue problems. To achieve acceptable engineering accuracy for complex engineering problems, it was suggested that a very small amount of prior data from physical experiments be used to improve the training accuracy and efficiency. In contrast to PINN, which solves the neutron diffusion eigenvalue problem directly, an autoencoder-based machine learning method combined with the reduced-order method [54, 55] was proposed [56] to solve the same problem. However, it still relies on solving the governing equations with traditional numerical methods such as the finite difference method.

Although DRM provides a way to solve eigenvalue problems with neural networks, as shown in Sect. 4.2.2, our results indicate that DRM is not stable when solving two-dimensional cases. First, DRM learns the eigenvalue and eigenfunction at the early stage of the training process. Subsequently, DRM attempts to learn a smaller eigenvalue once it is close to the true eigenvalue. Finally, DRM ends up with one smaller eigenvalue that may be close to the true eigenvalue and one incorrect eigenfunction that is meaningless and far from the true eigenfunction. Additionally, the framework of a neural network based on the power method [50] is unsuitable for generalized eigenvalue problems. Therefore, it is necessary to propose a new algorithm to solve K-eigenvalue problems.

This study focuses on eigenvalue problems that are also interface problems, in which the eigenfunctions are continuous at the interface while their derivatives are not. In the nuclear reactor physics domain, this is a general situation, as a reactor core is composed of fuel assemblies with different fissile nuclide enrichments [1]. Some studies have focused on using neural networks to solve elliptic interface problems. The authors of [57] used the idea of DRM and formulated the PDEs as variational problems, which can be solved using a deep learning approach. A novel mesh-free numerical method for solving elliptic interface problems based on deep learning was presented in [58]; the authors employed different neural networks in different subdomains and reformulated the problem as a least-squares problem. Similarly, the authors of [53] enforce the interface conditions using a piecewise neural network. In contrast to these methods, a discontinuity-capturing shallow neural network (DCSNN) [59] has been proposed for elliptic interface problems. The crucial concept of DCSNN is that a d-dimensional piecewise continuous function can be extended to a continuous function defined in (d+1)-dimensional space, where the augmented coordinate variable labels the pieces of each subdomain. However, to the best of the authors' knowledge, only a few studies have focused on using neural networks to solve eigenvalue problems that incorporate interface problems involving regions of different materials. Challenges remain on at least three fronts.

• Designing a neural network that is more suitable for K-eigenvalue problems in more complicated, medium-size test problems.

• Dealing with the interface problem in a more general and understandable manner when designing the neural network.

• Proposing a framework that effectively enhances the robustness of the neural network and improves the efficiency of utilizing noisy prior data.

To address the aforementioned challenges and advance beyond the state-of-the-art research [45, 51, 53], we initially introduced the study [40], which served as a preliminary demonstration of the applicability of the PINN approach to reactor physics eigenvalue problems in complex engineering scenarios. The contributions of this study are as follows.

• First, we extend the Inverse Power Method Neural Network (IPMNN) [49] to the so-called Generalized Inverse Power Method Neural Network (GIPMNN) to solve for the smallest eigenvalue and the related eigenfunction of generalized K-eigenvalue problems. Compared to DEPINN from our previous study [53], we omit the prior data in the training process and attempt to solve the K-eigenvalue problems from a data-free mathematical/numerical perspective.

• Then, we propose a Physics-Constrained GIPMNN (PC-GIPMNN) to address the interface problem in a more general and understandable manner with respect to previous studies [45, 51, 53].

• Finally, we conduct a thorough comparative study of GIPMNN, PC-GIPMNN, and DRM using a variety of numerical tests. We evaluate the applicability and accuracy of these three methods using typical 1D and 2D test cases in reactor physics, particularly accounting for material discontinuities in different geometries. In the 1D example, we determine the optimal ratio of outer to inner iterations, a finding that may be particularly relevant for GIPMNN. Additionally, we observe the failure of DRM in the 2D experiments, whereas PC-GIPMNN consistently outperforms GIPMNN and DRM over a set number of epochs.

The rest of this paper is organized as follows. The governing equations for the eigenvalue problems are presented in Sect. 2. In Sect. 3, we propose GIPMNN and PC-GIPMNN and introduce DRM for our cases. In Sect. 4, the results of 1D and 2D test cases are presented to verify the three methods. Finally, conclusions and future research are discussed in Sect. 5.

2 K-eigenvalue problems

This section introduces the equations that govern neutron criticality over a spatial domain. We recall Eqs. (1) and (2). The generalized K-eigenvalue problem can be formulated as follows:

$$L\phi=\lambda Q\phi \quad \text{in } \Omega, \qquad B\phi=g \quad \text{on } \partial\Omega,\tag{3}$$

where $\Omega\subset\mathbb{R}^d$ is the domain, and $L$, $Q$, and $B$ denote the operators acting on the functions defined in the interior of $\Omega$ and at the boundary of $\Omega$ ($\partial\Omega$). Furthermore, $\phi$ denotes the eigenfunction of the system and $\lambda$ denotes the associated eigenvalue. In this preliminary study, inspired by the notable work in [56], we utilize the one-group steady-state diffusion equation for criticality, framing it as a generalized eigenvalue problem. It is expressed as:

$$-\nabla\cdot\left(D\nabla\phi\right)+\Sigma_{\text{a}}\phi=\frac{1}{k_{\text{eff}}}\nu\Sigma_{\text{f}}\phi \quad \text{in } \Omega,\tag{4}$$

where the eigenfunction $\phi$ denotes the neutron flux, a scalar quantity used in nuclear reactor physics that corresponds to the total length travelled by all free neutrons per unit time and volume. $\Sigma_{\text{a}}$, $\Sigma_{\text{f}}$, and $\nu$ denote the absorption cross section, fission cross section, and average number of neutrons produced per fission event, respectively. We follow [56] and use the diffusion coefficient approximation $D=\frac{1}{3\Sigma_{\text{t}}}$, where the total cross section is $\Sigma_{\text{t}}=\Sigma_{\text{a}}+\Sigma_{\text{s}}$ and $\Sigma_{\text{s}}$ denotes the scattering cross section.

Two main boundary conditions are imposed on the diffusion equation. One condition represents a surface on which neutrons are reflected back into the domain (reflective condition), and the other represents surfaces that allow neutrons to escape from the system (vacuum or bare condition). Both conditions are satisfied by relating the flux solution to its gradient on the boundary, i.e.,

$$\nabla\phi\cdot n=0 \quad \text{(reflective)}, \qquad D\,\nabla\phi\cdot n=-\frac{1}{2}\phi \quad \text{(vacuum)},\tag{7}$$

where $n$ denotes the outward normal to the surface.

2.1 PINN as an eigenvalue solver

In this subsection, we discuss using PINN to solve the generalized eigenvalue problem (Eq. (2)).

The eigenfunction of the operator $L$ is approximated by $N_\theta$, i.e., $\phi(x)=N_\theta(x)$. Then, $L\phi$ and $B\phi$ can be computed using AD. For the boundary conditions, a penalty term (Eq. (8)) is added to the loss function in PINN, which penalizes the discrepancy between the approximated value on the boundary and the exact boundary condition:

$$\mathrm{Loss}_{\text{b}}=\frac{1}{N_{\text{b}}}\sum_{i=1}^{N_{\text{b}}}\left|B\phi(x_i)-g(x_i)\right|^2,\tag{8}$$

where $N_{\text{b}}$ denotes the number of sampling points on the boundary $\partial\Omega$ and $x_i$ is one point in the sampling set $\{x_i\}_{i=1}^{N_{\text{b}}}$.

As mentioned previously, we are concerned with the smallest eigenvalue and the associated eigenfunction. Hence, the eigenvalue (Rayleigh quotient) is viewed as a loss term in PINN to attain the lowest eigenvalue:

$$\mathrm{Loss}_{\lambda}=\lambda^2=\left(\frac{\langle L\phi,\phi\rangle}{\langle Q\phi,\phi\rangle}\right)^2.\tag{9}$$

Finally, the total loss function (Eq. (10)) in PINN corresponds to the weighted sum of the two objectives (Eq. (8)) and (Eq. (9)),

$$\mathrm{Loss}=\alpha\,\mathrm{Loss}_{\lambda}+\beta\,\mathrm{Loss}_{\text{b}},\tag{10}$$

where $\alpha$ and $\beta$ correspond to the weights.

Unfortunately, PINN does not work for the cases in this study and even performs worse than DRM. Therefore, we only compared the results of our methods with those of DRM.

Remark 1 It should be noted that the square of the eigenvalue is used as the loss function in PINN. Because the smallest eigenvalue is the one whose absolute value is smallest, without the square term PINN would attempt to drive the eigenvalue toward negative infinity.

3 Methodologies

In this section, we extend our previous study [49] and discuss the use of a neural network to numerically solve the smallest eigenvalue problem. The main concept is to use a neural network to approximate the eigenfunction and to compute the eigenvalue from the Rayleigh quotient, using the eigenvector formed by evaluating the eigenfunction at the sampled points.

3.1 Neural network architecture

Next, we discuss the neural network structure used to approximate the eigenfunction $\phi(x)$. The neural network architecture employed in this study is the same as that of ResNet [60], which is built by stacking several residual modules. It is one of the most popular models used in deep learning, and it is also commonly used in the field of solving PDEs via neural networks [24, 61, 62]. Each module has one skip connection, and each block consists of two fully connected layers, as shown in Fig. 1.

In the network, let $x\in\mathbb{R}^d$ be the input, and let $W_l$ and $b_l$, $l=1,2,3,4,5$, be the parameters in the fully connected layers. We use $W_{r_k}$ and $b_{r_k}$, $k=1,2$, to denote the parameters in the residual connections. The results $s_1$ and $s_2$ of the two modules can be expressed as follows:

$$s_1=\sigma\left(W_2\,\sigma(W_1x+b_1)+b_2\right)+\left(W_{r_1}x+b_{r_1}\right),$$
$$s_2=\sigma\left(W_4\,\sigma(W_3s_1+b_3)+b_4\right)+\left(W_{r_2}s_1+b_{r_2}\right),$$

with the network output $N_\theta(x)=W_5s_2+b_5$, where $\sigma$ denotes the activation function.

Therefore, the eigenfunction is $\phi(x)=N_\theta(x)$, where $\theta$ denotes the neural network parameters.

Fig. 1 The neural network architecture consists of two modules and one linear output layer. Each module contains two fully connected layers and a skip connection

Remark 2 Given that $\phi(x)$ is a scalar measure of the neutron population and denotes the density of the free-moving neutron distribution over the spatial domain, we require $\phi(x)\geq 0$. Therefore, the eigenfunction can be expressed as $\phi(x)=\left(N_\theta(x)\right)^2$.
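For concreteness, the following PyTorch sketch shows one way to realize this architecture: two residual modules of two fully connected layers each, a linear output layer, the tanh activation used in the experiments, and the squared output of Remark 2. It is our illustrative reading, not the authors' code, and it uses identity skip connections rather than the weighted shortcuts $W_{r_k}$, $b_{r_k}$ described above.

```python
import torch
import torch.nn as nn

class ResidualModule(nn.Module):
    """Two fully connected layers plus a skip connection (Fig. 1)."""
    def __init__(self, width):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.fc2 = nn.Linear(width, width)
        self.act = nn.Tanh()

    def forward(self, x):
        return self.act(self.fc2(self.act(self.fc1(x)))) + x  # identity skip

class EigenNet(nn.Module):
    """Approximates phi(x) = (N_theta(x))^2 >= 0 (Remark 2)."""
    def __init__(self, dim_in=1, width=20):
        super().__init__()
        self.lift = nn.Linear(dim_in, width)   # lift input to hidden width
        self.mod1 = ResidualModule(width)
        self.mod2 = ResidualModule(width)
        self.out = nn.Linear(width, 1)         # linear output layer

    def forward(self, x):
        s = self.mod2(self.mod1(torch.tanh(self.lift(x))))
        return self.out(s) ** 2                # squared output keeps phi >= 0
```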

3.2 Recap of inverse power method neural network

We consider the following linear eigenvalue problem:

$$L\phi=\lambda\phi,$$

which differs from the generalized eigenvalue problem $L\phi=\lambda Q\phi$.

Inspired by the idea of the inverse power method, IPMNN [49] was proposed to solve for the smallest eigenvalue and the associated eigenfunction. Equation (16) depicts the key step of the inverse power method,

$$\phi_k=\frac{A^{-1}\phi_{k-1}}{\left\|A^{-1}\phi_{k-1}\right\|},\tag{16}$$

where $A$ denotes a matrix, and $\lambda_{k-1}$ and $\phi_{k-1}$ denote the results of the previous iteration. Equation (17) is analogous to Eq. (16), with the matrix $A$ replaced by the differential operator $L$:

$$L\Phi_k\approx\lambda_{k-1}\Phi_{k-1}.\tag{17}$$

Therefore, $\lambda_k$ and $\phi_k$ are obtained by iterating Eq. (16).

Here, the neural network $N_\theta$ represents the approximated eigenfunction $\Phi_k$ at the $k$-th iteration of Eq. (17). Given $\Phi_{k-1}$, which is obtained from the previous iterative step, and following the main idea of the inverse power method, we would have to solve for $\Phi_k$ using $P_k=L^{-1}\Phi_{k-1}$ and $\Phi_k=P_k/\|P_k\|$. However, it is difficult to obtain the inverse operator $L^{-1}$ of the differential operator $L$. Therefore, $\Phi_k$ is obtained without knowing the inverse operator: the term $L\Phi_k$ is computed using AD in Eq. (17). The main idea is that it is not necessary to calculate $L^{-1}$; the eigenfunction can be approximated iteratively by minimizing the loss defined in Eq. (18) so as to approach Eq. (17),

$$\mathrm{Loss}_{\text{ipmnn}}=\frac{1}{N}\sum_{i=1}^{N}\left|L\Phi_k(x_i)-\lambda_{k-1}\Phi_{k-1}(x_i)\right|^2,\tag{18}$$

where $x_i\in S$ is a point in the dataset and $N$ denotes the number of points in $S$. The eigenvalue at the $k$-th iteration is obtained using Eq. (19):

$$\lambda_k=\frac{\langle L\Phi_k,\Phi_k\rangle}{\langle\Phi_k,\Phi_k\rangle}.\tag{19}$$

3.3 Generalized inverse power method neural network

In standard nuclear engineering procedures, the inverse power method is normally employed to solve for the smallest eigenvalue and the associated eigenvector, relying on the discretization of Eq. (3). IPMNN [49] was proposed to solve the smallest eigenvalue problem in a mesh-free manner realized by a neural network. However, that method is restricted to equations of the simple form $L\phi=\lambda\phi$. Therefore, we propose GIPMNN to solve Eq. (3). Details of the GIPMNN algorithm are presented in Algorithm 1.

First, the generalized inverse power method is used to solve Eq. (20),

$$A\phi=\lambda B\phi.\tag{20}$$

The key step is given in Eq. (21),

$$A\phi_k=\lambda_{k-1}B\phi_{k-1},\qquad \phi_k\leftarrow\frac{\phi_k}{\|\phi_k\|},\tag{21}$$

where $A$ and $B$ are two matrices, and $\lambda_{k-1}$ and $\phi_{k-1}$ are the results of the previous iteration. Therefore, $\lambda_k$ and $\phi_k$ are obtained by iterating Eq. (21).

In GIPMNN, Eq. (22) is an analog of Eq. (21),

$$L\Phi_k\approx\lambda_{k-1}Q\Phi_{k-1},\tag{22}$$

where $L$ and $Q$ denote linear differential operators implemented by AD, as opposed to specially discretized matrices. In a manner similar to the generalized inverse power method, we record the result $\lambda_{k-1}$ of the previous iteration. In contrast to the generalized inverse power method, instead of recording $\phi_{k-1}$, we record $Q\Phi_{k-1}$. It should be noted that $\Phi_{k-1}$ denotes the eigenfunction represented by the neural network at the $(k-1)$-th iteration, and $Q\Phi_{k-1}$ is realized by AD. At the $k$-th iteration, we directly compute $\Phi_k$ using the neural network, i.e., $\Phi_k=N_\theta$, and calculate $L\Phi_k$ using AD. We obtain $\Phi_k$ directly through the neural network instead of solving the equation $L\Phi_k=\lambda_{k-1}Q\Phi_{k-1}$; we define the loss function $\mathrm{Loss}_{\text{gipmnn}}$ in Eq. (23) to propel the neural network to learn $\Phi_k$:

$$\mathrm{Loss}_{\text{gipmnn}}=\frac{1}{N}\sum_{i=1}^{N}\left|L\Phi_k(x_i)-\lambda_{k-1}Q\Phi_{k-1}(x_i)\right|^2.\tag{23}$$

When the neural network converges, we obtain the smallest eigenvalue, and the associated eigenfunction can be expressed using the neural network.
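As a concrete illustration, the sketch below applies this iteration to the one-group problem of Eq. (4) in 1D, with $L=-\frac{\mathrm{d}}{\mathrm{d}x}\left(D\frac{\mathrm{d}}{\mathrm{d}x}\right)+\Sigma_{\text{a}}$ and $Q=\nu\Sigma_{\text{f}}$ applied via automatic differentiation. All names are ours, the boundary loss is only indicated by a comment, and the inner/outer split anticipates the ratio study of Sect. 4.1.1; it is a minimal sketch, not the authors' code.

```python
import torch

def apply_L(net, x, D, sigma_a):
    """L(Phi) = -d/dx(D dPhi/dx) + Sigma_a * Phi, computed with autograd."""
    phi = net(x)
    dphi = torch.autograd.grad(phi.sum(), x, create_graph=True)[0]
    dflux = torch.autograd.grad((D(x) * dphi).sum(), x, create_graph=True)[0]
    return -dflux + sigma_a(x) * phi, phi

def gipmnn_sweep(net, opt, x, D, sigma_a, nu_sf, lam_prev, q_prev, n_inner):
    """One outer iteration: n_inner gradient steps on the Eq. (23) residual,
    then Rayleigh-quotient refreshes of lambda_{k-1} and Q*Phi_{k-1}."""
    for _ in range(n_inner):
        L_phi, _ = apply_L(net, x, D, sigma_a)
        loss = ((L_phi - lam_prev * q_prev) ** 2).mean()  # + beta * boundary loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    L_phi, phi = apply_L(net, x, D, sigma_a)
    q = nu_sf(x) * phi                                    # Q*Phi_k (multiplicative Q)
    lam = ((L_phi * phi).sum() / (q * phi).sum()).detach()
    return lam, (q / phi.norm()).detach()                 # records Q(Phi_k/||Phi_k||)

# usage sketch (coefficients as callables evaluated on the training points):
# x = torch.linspace(0.0, 10.0, 1001).reshape(-1, 1).requires_grad_(True)
# lam, q = torch.tensor(1.0), torch.ones(1001, 1)
# for k in range(n_outer):
#     lam, q = gipmnn_sweep(net, opt, x, D, sigma_a, nu_sf, lam, q, n_inner)
```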

Remark 3 In the GIPMNN algorithm, given that the initial function $\Phi_0$ is not represented by the neural network, it is not possible to obtain $Q\Phi_0$ using AD. Therefore, we choose an arbitrary function as $Q\Phi_0$.

The Neumann and Robin boundary conditions are used for our eigenvalue problem. It is difficult to enforce them by encoding the boundary conditions into the neural network, as was done in [49, 63, 64] for the Dirichlet and periodic boundary conditions.

We form the loss function $\mathrm{Loss}_{\text{b}}$ as in Eq. (8), where $N_{\text{b}}$ denotes the number of sampling points on $\partial\Omega$, and $x_i$ is a point in the sampling set $\{x_i\}_{i=1}^{N_{\text{b}}}$.

The loss functions in Eqs. (24) and (25) penalize the residuals of the reflective and vacuum conditions, respectively, according to whether the surface reflects neutrons back into the domain or allows them to escape from the system.

Algorithm 1: GIPMNN for Finding the Smallest Eigenvalue
Given: N denotes the number of points for training the neural network, Nb denotes the number of points on the boundary ∂Ω, Length denotes a measure of ∂Ω, Nepoch denotes the maximum number of epochs, and ε denotes the stopping criterion.
Step 1: Build dataset S for the loss of the Rayleigh quotient and dataset Sb for the loss of the boundary.
Step 2: Initialize a neural network with random initialization of parameters.
for k = 1, 2, ..., Nepoch do
    Input all points in S and Sb into the neural network Nθ. Let Φk(xi) = Nθ(xi), where xi ∈ S ∪ Sb;
    Lossgipmnn = (1/N) Σ_{i=1}^{N} |LΦk(xi) − λ_{k−1} QΦ_{k−1}(xi)|²;
    Lossb = (Length/Nb) Σ_{i=1}^{Nb} |BΦk(xi) − g(xi)|²;
    Losstotal = α Lossgipmnn + β Lossb;
    Φ_{k−1} = Φk/‖Φk‖, λ_{k−1} = ⟨LΦk, Φk⟩ / ⟨QΦk, Φk⟩;
    Update the parameters θ of the neural network via gradient descent: θ_{k+1} = θ_k − η ∇θ Losstotal(θ_k), where η is the learning rate and θ_k is the vector of parameters at the k-th iteration.
    if Losstotal < ε then record the best eigenvalue and eigenfunction and stop.
end

The total loss function (Eq. (26)) denotes the weighted sum of the objectives (Eq. (23)) and (Eq. (8)):

$$\mathrm{Loss}_{\text{total}}=\alpha\,\mathrm{Loss}_{\text{gipmnn}}+\beta\,\mathrm{Loss}_{\text{b}}.\tag{26}$$

For the process of GIPMNN, refer to Algorithm 1.

Remark 4 In Eq. (26), $\alpha$ and $\beta$ denote the weights of the two losses. We note that taking $\beta\geq\alpha$ when training the neural network, particularly in GIPMNN, makes it easier for the neural network to find the eigenvalue and eigenfunction.

3.4 Physics constrained generalized inverse power method neural network

Although GIPMNN can solve the eigenvalue problems of Eq. (3), it still struggles with eigenvalue problems whose coefficients are discontinuous across different regions. We discuss interface problems, in which the eigenfunction is continuous at the interface while its derivatives are not. The enforcement of interface conditions is therefore very important for GIPMNN.

In this study, inspired by the idea of a piecewise deep neural network [58], we propose PC-GIPMNN to solve eigenvalue problems with interface conditions. However, instead of employing different neural networks in different subdomains, we use only one neural network with multiple neurons in the output layer, as shown in Fig. 2. Each neuron in the output layer corresponds to a subdomain. Therefore, we can obtain outputs in different subdomains that can be used to enforce the conditions at the interface.

Suppose that there are two domains $\Omega_{\text{l}}$ and $\Omega_{\text{r}}$ with an interface $\Gamma$ between them. Given the properties of the neutron population $\phi(x)$, the eigenfunction satisfies two interface conditions, Eqs. (27) and (28):

$$\phi_{\text{l}}=\phi_{\text{r}} \quad \text{on } \Gamma,\tag{27}$$

$$D_{\text{l}}\nabla\phi_{\text{l}}\cdot n=D_{\text{r}}\nabla\phi_{\text{r}}\cdot n \quad \text{on } \Gamma,\tag{28}$$

where $\phi_{\text{l}}$ and $\phi_{\text{r}}$ represent the eigenfunctions defined in $\Omega_{\text{l}}$ and $\Omega_{\text{r}}$, respectively, and $n$ denotes the normal vector pointing from $\Omega_{\text{r}}$ to $\Omega_{\text{l}}$. $D_{\text{l}}$ and $D_{\text{r}}$ are the coefficients defined in $\Omega_{\text{l}}$ and $\Omega_{\text{r}}$, respectively, which are discontinuous at the interface. Equation (27) indicates that the eigenfunction, i.e., the neutron flux, is continuous at the interface. Equation (28) indicates that the neutron current is continuous at the interface.

Assume that $S_\Gamma$ corresponds to the set of points at the interface $\Gamma$ and $|S_\Gamma|$ denotes the number of points in $S_\Gamma$. We then introduce two penalty terms, Eqs. (29) and (30), to enforce the two interface conditions:

$$\mathrm{Loss}_{\text{i1}}=\frac{1}{|S_\Gamma|}\sum_{x_i\in S_\Gamma}\left|\phi_{\text{l}}(x_i)-\phi_{\text{r}}(x_i)\right|^2,\tag{29}$$

$$\mathrm{Loss}_{\text{i2}}=\frac{1}{|S_\Gamma|}\sum_{x_i\in S_\Gamma}\left|D_{\text{l}}\nabla\phi_{\text{l}}(x_i)\cdot n-D_{\text{r}}\nabla\phi_{\text{r}}(x_i)\cdot n\right|^2,\tag{30}$$

where $x_i\in S_\Gamma$.

Fig. 2 Illustration of the PC-GIPMNN architecture. There are multiple neurons in the output layer, which denote the eigenfunctions in different subdomains

Combining (26), (29), and (30), Equation (31) gives the total loss defined in our method PC-GIPMNN,

$$\mathrm{Loss}=\alpha\,\mathrm{Loss}_{\text{gipmnn}}+\beta\,\mathrm{Loss}_{\text{b}}+\gamma\,\mathrm{Loss}_{\text{i1}}+\delta\,\mathrm{Loss}_{\text{i2}},\tag{31}$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are the weights of the different losses, and the global eigenfunction entering $\mathrm{Loss}_{\text{gipmnn}}$ is assembled as $\phi=\phi_{\text{l}}\,l_{\text{l}}+\phi_{\text{r}}\,l_{\text{r}}$. In this study, we focused on the proposed algorithms and neglected the influence of the weights; we expect the proposed algorithms to be universal and to achieve good results even without adjusting the weights. Therefore, in the subsequent experiments all weights are set to 1.

Here, $l_{\text{l}}$ and $l_{\text{r}}$ denote the indicator functions: $l_{\text{l}}=1$ in $\Omega_{\text{l}}$ and $l_{\text{l}}=0$ in $\Omega_{\text{r}}$, while $l_{\text{r}}=1$ in $\Omega_{\text{r}}$ and $l_{\text{r}}=0$ in $\Omega_{\text{l}}$.
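The sketch below illustrates this construction for the two-subdomain, 1D setting: one network with two output neurons yields $\phi_{\text{l}}$ and $\phi_{\text{r}}$, indicator masks assemble the global eigenfunction, and the penalties (29) and (30) are evaluated at interface points. The names and the plain feed-forward trunk are our illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 20), nn.Tanh(),
                    nn.Linear(20, 20), nn.Tanh(),
                    nn.Linear(20, 2))              # one output neuron per subdomain

def subdomain_fluxes(x):
    out = net(x) ** 2                              # squared output keeps phi >= 0
    return out[:, :1], out[:, 1:]                  # phi_l, phi_r

def global_flux(x, in_left):
    """phi = phi_l * 1_l + phi_r * 1_r via indicator masks."""
    phi_l, phi_r = subdomain_fluxes(x)
    return torch.where(in_left, phi_l, phi_r)

def interface_losses(x_gamma, D_l, D_r):
    """Flux continuity (Eq. (29)) and current continuity (Eq. (30)) on Gamma."""
    x = x_gamma.clone().requires_grad_(True)
    phi_l, phi_r = subdomain_fluxes(x)
    g_l = torch.autograd.grad(phi_l.sum(), x, create_graph=True)[0]
    g_r = torch.autograd.grad(phi_r.sum(), x, create_graph=True)[0]
    loss_flux = ((phi_l - phi_r) ** 2).mean()
    loss_current = ((D_l * g_l - D_r * g_r) ** 2).mean()
    return loss_flux, loss_current
```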

Remark 5 Conservative PINN (cPINN) [53] developed a PINN for each subdomain and considered an additional conservation law along the subdomains' interfaces (a general consideration in reactor physics [11]). However, in that work the eigenvalue is involved in neural network training as a variable to be optimized, and the numerical examples presented are only one-dimensional; the relative errors of $k_{\text{eff}}$ in those cases are $4.4800\times10^{-4}$ and $3.3500\times10^{-4}$. Similarly, we use Eqs. (29) and (30) to enforce the interface conditions. As shown below, our methods are more generic and yield better results.

Remark 6 In [45], the impact of the interface conditions was ignored, the relative errors of $k_{\text{eff}}$ in their cases were $1.3\times10^{-3}$ and $4.4\times10^{-3}$, respectively, and the study did not involve the smallest eigenvalue problem. In this study, we obtain the lowest eigenvalue using the inverse power method rather than a free learnable parameter that approximates the eigenvalue. Our numerical results demonstrate that accurate results can be obtained in more complicated cases.

3.5 Deep Ritz method

DRM is a deep-learning-based method for numerically solving variational problems [24]. It reformulates the original PDEs into equivalent variational equations and defines the loss function based on the variational formulation. The solutions of the PDEs are represented by a neural network, and the derivatives are calculated using AD. DRM [24] has also been used to solve eigenvalue problems. Below, we specify how to use DRM to solve Eq. (3).

We consider the variational principle for the smallest eigenvalue. For Eq. (4), the smallest eigenvalue is characterized by the Rayleigh quotient

$$\lambda=\min_{\phi\neq0}\frac{\int_\Omega\left(D|\nabla\phi|^2+\Sigma_{\text{a}}\phi^2\right)\mathrm{d}x}{\int_\Omega\nu\Sigma_{\text{f}}\,\phi^2\,\mathrm{d}x},$$

with boundary terms entering the numerator according to the imposed boundary conditions.

Remark 7 We enforce a Dirichlet boundary condition by adding a penalty term (Eq. (34)). However, if the boundary condition is of Neumann or Robin type, we do not use the penalty term (34), because the boundary condition is incorporated into the Rayleigh quotient via Green's first identity [65].

4 Experiments

In this section, we present numerical experiments to compare the applicability and accuracy of GIPMNN, PC-GIPMNN, and DRM for solving the smallest eigenvalue problems in reactor physics. In all the experiments below, we chose the Adam optimizer with an initial learning rate of $10^{-3}$ to minimize the loss function. Furthermore, we trained the neural network with the ResNet architecture on a server equipped with a CentOS 7 system, one Intel Xeon Platinum 8358 2.60-GHz CPU, and one NVIDIA A100 80-GB GPU. Moreover, unless otherwise specified, the activation function was the tanh function.


Fig. 3 Macroscopic cross-sections for the 1D slab reactor, modified based on the figure reported in [66]. There are three regions with different materials and functions

4.1 One-dimensional slab reactor

We consider a one-dimensional case with a simple slab reactor consisting of a domain bounded by vacuum or bare surfaces, as shown in Fig. 3. The total length of the slab reactor is 10 cm. It consists of fuel and control rod regions labeled 1, 2, and 3. The control rods are located between 2.2 and 2.5 cm and between 7.5 and 7.8 cm on the x-axis.

As shown in Fig. 3, there are two control rods in the one-dimensional slab reactor, and either rod can be withdrawn or inserted. Three scenarios were designed to model the reactor and fully simulate the actions of the control rods. The three situations, labeled F1, F2, and F3 in Table 1, indicate that both rods are withdrawn, that only the left rod is inserted, and that only the right rod is inserted, respectively. We also considered three other problems, R1, R2, and R3 in Table 1. Problem R1 is the same as F1 and is designed to simulate the withdrawal of both control rods; we use R1 to facilitate comparison with R2 and R3. Although problems R2 and R3 both represent inserted control rods, problem R2 resembles heavily inserted control rods, whereas problem R3 resembles slightly inserted control rods. To generate the data required to validate the accuracy of GIPMNN, PC-GIPMNN, and DRM, FreeFEM [67-73] is utilized to solve the problems in Table 1. FreeFEM is a partial differential equation solver for non-linear multi-physics systems based on the finite element method. We choose the number of cells in the x-direction as N = 1001 with mesh size Δx = 10⁻³. The baseline solution is obtained by FEM, and the same uniform cells are used to train GIPMNN, PC-GIPMNN, and DRM.

Table 1 Six sets of material cross-sections from the work [66] are used to test GIPMNN and DRM

Remark 8 When solving all the parameter-dependent problems below, the parameters $\Sigma_{\text{a}}$, $\Sigma_{\text{s}}$, and $\nu\Sigma_{\text{f}}$ are selected based on the region to which the data point $x$ belongs. For PC-GIPMNN, the data points on an interface are considered to belong to the adjacent regions simultaneously; therefore, we can enforce the interface conditions using the data points on the interface.

Remark 9 As unsupervised algorithms, our methods use only sampled points to train the neural network, without any prior knowledge. Therefore, there is no test set for the proposed algorithms.

4.1.1 Using GIPMNN to solve for Eigenvalue

In this one-dimensional example, we select 20 neurons for each hidden layer in ResNet and Nepoch = 50000. Without loss of generality, the weights α and β of the different losses are not tuned for better results; both are set to one.

In Eq. (21), we must solve for $\phi_k$ using $\lambda_{k-1}$ and $\phi_{k-1}$ from the previous iteration. To accelerate the training process and obtain more accurate results with Algorithm 1, we split the iterations of the original Algorithm 1 into inner and outer iterations, as shown in Algorithm 3, where Ninner and Nouter denote the numbers of inner and outer iterations, respectively.

We chose ratios of outer to inner iterations of 1:1, 1:10, 1:100, 1:1000, and 1:10000 to investigate the effect of the ratio on the results. A ratio of 1:n implies that the outer code is executed once while the inner code is executed n times. For comparison, the total number of iterations was fixed at Ntotal = 100000 with Ntotal = Ninner × Nouter. The relative errors of keff and of the eigenfunction for different ratios of outer to inner iterations during the training process are shown in Fig. 4. The results for $k_{\text{eff}}^{\text{rel}}$ are shown in the first row and those for $\phi^{\text{rel}}$ in the second row, where the relative errors of keff and of the eigenfunction are calculated as

$$k_{\text{eff}}^{\text{rel}}=\frac{\left|k_{\text{eff}}^{\text{NN}}-k_{\text{eff}}^{\text{FEM}}\right|}{k_{\text{eff}}^{\text{FEM}}},\qquad \phi^{\text{rel}}=\frac{\left\|\phi^{\text{NN}}-\phi^{\text{FEM}}\right\|_2}{\left\|\phi^{\text{FEM}}\right\|_2},$$

respectively. As shown in the figures, the best ratio is 1:1, because the relative errors of keff and of the eigenfunction for the ratio 1:1 are smaller than those for the other ratios during training. The convergence worsens as the ratio of outer to inner iterations changes from 1:1 to 1:10000. Therefore, we trained GIPMNN with a ratio of 1:1 to solve the 1D and 2D problems.

Moreover, we fixed the number of outer iterations at Nouter = 1000 and retrained the neural network using the same ratios. The results are shown in Fig. 5. The opposite trend is observed compared with the results in Fig. 4: the ratio 1:1 is now the worst, because increasing the number of inner iterations leads to a better approximation of the eigenfunction for the next outer iteration. This is consistent with the behavior of the inverse power method.

4.1.2 Using PC-GIPMNN to solve for Eigenvalue

We divided the slab reactor into five parts from left to right. The output layer includes five neurons, and five functions are defined on the different subdomains: u, w, and q denote the functions defined on the three segments of Region 1, and v and p denote those defined on Regions 2 and 3, respectively.

As shown in Fig. 3, the four interface points are denoted as $x_{i_1}=2.2$, $x_{i_2}=2.5$, $x_{i_3}=7.5$, and $x_{i_4}=7.8$. The interface conditions (29) and (30) are implemented as the loss functions defined in (38) and (39) for this 1D example.

Algorithm 3: The iterations in Algorithm 1 are split into inner and outer iterations.
for k = 1, 2, ..., Nouter do
    for j = 1, 2, ..., Ninner do
        Input all points in S and Sb into the neural network Nθ. Let Φk(xi) = Nθ(xi), where xi ∈ S ∪ Sb;
        Lossgipmnn = Σ_{i=1}^{N} |LΦk(xi) − λ_{k−1} QΦ_{k−1}(xi)|²;
        Lossb = Σ_{i=1}^{Nb} |BΦk(xi) − g(xi)|²;
        Losstotal = α Lossgipmnn + β Lossb;
        if j = Ninner then
            Φ_{k−1} = Φk/‖Φk‖;
            λ_{k−1} = ⟨LΦk, Φk⟩ / ⟨QΦk, Φk⟩;
        end
        Update the parameters θ of the neural network via gradient descent: θ_{k,j+1} = θ_{k,j} − η ∇θ Losstotal(θ_{k,j}), where η denotes the learning rate and θ_{k,j} denotes the vector of parameters at the (k, j)-th iteration.
    end
    if Losstotal < ε then record the best eigenvalue and eigenfunction and stop.
end

The eigenfunction can be expressed as follows:

$$\phi=u\,l_1+v\,l_2+w\,l_3+p\,l_4+q\,l_5,$$

where $l_1,\dots,l_5$ are indicator functions equal to 1 or 0 according to whether the input $x$ lies inside or outside the corresponding subdomain.

4.1.3 Using DRM to solve for Eigenvalue

The configuration of DRM is identical to that of GIPMNN. Given the variational formulation, in which the boundary condition is incorporated into the Rayleigh quotient, we do not use a penalty term to enforce the boundary condition. The loss function $\mathrm{Loss}_{\text{drm}}$ is defined as

$$\mathrm{Loss}_{\text{drm}}=\frac{\dfrac{\mathrm{Length}_{\text{slab}}}{N}\displaystyle\sum_{i=1}^{N}\left(D|\nabla\Phi|^2+\Sigma_{\text{a}}\Phi^2\right)(x_i)+\dfrac{1}{2}\Phi^2(x_{\text{l}})+\dfrac{1}{2}\Phi^2(x_{\text{r}})}{\dfrac{\mathrm{Length}_{\text{slab}}}{N}\displaystyle\sum_{i=1}^{N}\nu\Sigma_{\text{f}}\,\Phi^2(x_i)},$$

Fig. 4 (Color online) Relative errors of keff and φ for problems F1, F2, and F3 (from (a) to (c)) with respect to different ratios of outer to inner iterations during the training process. The first row shows the results for $k_{\text{eff}}^{\text{rel}}$ and the second row shows the results for $\phi^{\text{rel}}$. The neural network is trained with Ntotal = 100000

Fig. 5 (Color online) Relative errors of keff and φ for problems F1, F2, and F3 (from (a) to (c)) with respect to different ratios of outer to inner iterations during the training process. The first row shows the results for $k_{\text{eff}}^{\text{rel}}$ and the second row shows the results for $\phi^{\text{rel}}$. The neural network is trained with Nouter = 1000

where $\mathrm{Length}_{\text{slab}}$ denotes the length of the slab reactor, $x_{\text{l}}$ and $x_{\text{r}}$ denote the positions of the two endpoints of the slab reactor, and $N$ denotes the number of points used to approximate the integrals. The smallest eigenvalue $\lambda$ is then obtained once the neural network converges.
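A Monte Carlo sketch of this loss, under our reading of the formula above (volume integrals estimated by sample averages scaled by the slab length, vacuum boundary contributions entering through Green's first identity), might look as follows; the coefficient callables and all names are illustrative, not the authors' code.

```python
import torch

def drm_loss(net, length, n_pts, D, sigma_a, nu_sf):
    """Rayleigh-quotient loss for a 1D slab with vacuum ends at x=0 and x=length."""
    x = (length * torch.rand(n_pts, 1)).requires_grad_(True)
    phi = net(x)                       # net already outputs phi >= 0 (Remark 2)
    dphi = torch.autograd.grad(phi.sum(), x, create_graph=True)[0]
    numer = length * (D(x) * dphi ** 2 + sigma_a(x) * phi ** 2).mean()
    denom = length * (nu_sf(x) * phi ** 2).mean()
    xb = torch.tensor([[0.0], [length]])
    numer = numer + 0.5 * (net(xb) ** 2).sum()   # boundary term via Green's identity
    return numer / denom               # minimizing this minimizes lambda directly
```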

4.1.4 Results

The results for the one-dimensional example are listed in Tables 2, 3, and 6. The values of keff and their relative errors are listed in Table 2. For problems F1 and R1, which have continuous coefficients, the results obtained by GIPMNN are better than those obtained by PC-GIPMNN and DRM. For problem F2, the relative error of keff obtained by DRM is better than those of the other methods. For the other problems with discontinuous coefficients, the results obtained by PC-GIPMNN are better than those of GIPMNN and DRM. The relative error of keff computed by PC-GIPMNN is approximately $10^{-5}$, lower than the approximately $10^{-4}$ computed by DRM. The relative errors of the eigenfunctions are listed in Table 3. For all problems, PC-GIPMNN attains better results than GIPMNN and DRM.

In Fig. 6, the eigenfunctions of GIPMNN, PC-GIPMNN, and DRM are compared with those of FEM. Specifically, we follow the conventional normalization process in nuclear reactor physics [1], where the normalization constant is computed to make the average reactor power equal to unity; thus, the eigenfunctions are normalized by Eq. (42),

$$\phi\leftarrow\frac{\phi}{\frac{1}{N}\sum_{i=1}^{N}\phi(x_i)},\tag{42}$$

where $N$ denotes the number of training points in the entire domain.
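For reference, this normalization and the relative error metric used in the tables can be sketched as below; both follow our reconstruction of the formulas above, and the function names are ours.

```python
import torch

def normalize_avg(phi):
    """Eq. (42) (as reconstructed): scale phi so its average over the points is one."""
    return phi / phi.mean()

def rel_l2(phi_nn, phi_fem):
    """Relative L2 error between the normalized NN and FEM eigenfunctions."""
    a, b = normalize_avg(phi_nn), normalize_avg(phi_fem)
    return torch.linalg.norm(a - b) / torch.linalg.norm(b)
```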

The figures in the first row show the results for problems F1, F2, and F3, and those in the second row show the results for problems R1, R2, and R3. In each figure, we plot the eigenfunction obtained by FEM and compare the relative errors of the eigenfunctions computed by GIPMNN, PC-GIPMNN, and DRM. Evidently, the results obtained by PC-GIPMNN are better than those obtained by the other methods, which is consistent with the results in Table 3.

As reported in a previous study [49], IPMNN attains more accurate eigenvalues than DRM when linear eigenvalue problems without interfaces are considered. IPMNN and GIPMNN are thus suitable for finding the eigenfunction of a problem in strong form, whereas DRM, like FEM, is applicable to finding the eigenfunction of a problem in weak form. Consequently, DRM is better than GIPMNN in this one-dimensional example with discontinuous coefficients. However, because the interface conditions are well implemented in PC-GIPMNN, it successfully learns the eigenvalue and eigenfunction and achieves better results than GIPMNN and DRM, as shown in Tables 2 and 3.

Table 2 Results obtained by GIPMNN, PC-GIPMNN, and DRM compared with those obtained by FEM for the problems in Table 1. $k_{\text{eff}}^{\text{rel}}$ denotes the relative error of keff

Table 3 Results obtained by GIPMNN, PC-GIPMNN, and DRM compared with those obtained by FEM for the problems in Table 1. $\phi^{\text{rel}}$ denotes the relative error of the eigenfunction

Fig. 6 (Color online) Eigenfunctions obtained by GIPMNN, PC-GIPMNN, and DRM are compared with those obtained by FEM for the one-dimensional example. The eigenfunctions are normalized. The first row shows the results for problems F1, F2, and F3, and the second row shows the results for problems R1, R2, and R3

Remark 10 Specifically, DRM is applicable to finding the eigenfunction of a problem in weak form, which permits an eigenfunction of low regularity. In contrast, as shown in its implementation, GIPMNN requires the eigenfunction to exhibit high regularity; the learned eigenfunction is in strong form. Hence, GIPMNN is unable to obtain accurate values at the interface. PC-GIPMNN, however, does not require higher regularity of the eigenfunction at the interface but instead guarantees continuity and the physical constraints by enforcing the interface conditions. Therefore, PC-GIPMNN successfully learns the eigenvalue and eigenfunction and obtains better results than GIPMNN and DRM.

4.2 Two-dimensional reactor

As shown in Fig. 7, a two-dimensional reactor is modeled in a square domain with 90-cm sides. The reactor is surrounded by a neutron reflector of graphite material, for which the Robin boundary condition is selected. The main bulk of the reactor corresponds to the fuel region. Within its central region, four control rods can be inserted or withdrawn. When the control rods are withdrawn, the materials in those regions are replaced with water, which corresponds to common practice in many reactor designs. Table 4 lists the five different materials used in the mock reactor. The two types of fuel in Table 4 are designed to simulate different fuel materials: fuel type 1 is the standard fuel used in most problems, and fuel type 2 is an adjusted fuel composition different from fuel type 1. Fuel type 2 is used in problem R7 to test whether our methods are affected by different types of fuel.

Fig. 7 Geometry of a 2D reactor core with graphite, fuel, and four control-rod regions labeled 1, 2, 3, and 4. The figure is similar to the figure in a previous study [66]

Table 4 Cross section data for the various materials of the 2D reactor. These data are similar to those reported in a previous study [66]

As shown in Table 5, 12 problems are used to validate the accuracy of the proposed methods. Five full models, labeled F1-F5, simulate the reactor first with all control rods removed and then with a single control rod inserted in each of the control rod regions. As previously discussed, when a control rod is removed, the material in its region is replaced with water; therefore, W denotes water in Table 5. Seven other reactor configurations, denoted R1-R7, simulate cases in which more control rods are inserted. Problems R1 and R2 are equivalent to the full model problems F1 and F2. Problems R3-R6 utilize different combinations of inserted and withdrawn control rods. It should be noted that the material configuration of problem R7 differs from that of the other problems: the fuel is replaced with fuel type 2, and the control rods are assumed to be partially inserted, meaning that the materials in those regions correspond to a mix of control rod and water materials.

4.2.1 Using GIPMNN and PC-GIPMNN to solve for Eigenvalue

The number of points $N=N_xN_y$ is used to calculate $\mathrm{Loss}_{\text{gipmnn}}$, and the number of points $N_{\text{b}}=2(N_x-2)+2(N_y-2)+4$ is used to calculate $\mathrm{Loss}_{\text{b}}$. The number of neurons is 20 for each hidden layer in ResNet, and Nepoch = 500000 for both GIPMNN and PC-GIPMNN. Without loss of generality, α and β are set to one. As mentioned previously, we use the optimal ratio of outer to inner iterations, which is 1:1.

The 2D reactor is divided into six parts, as shown in Fig. 7. The output layer includes six neurons, and six functions are defined on the different subdomains, labeled u, v, w, r, p, and q: u, v, w, and r denote the functions defined on CR1, CR2, CR3, and CR4, and p and q denote the functions defined on the fuel and graphite regions, respectively. $S_{\text{CR1}}$, $S_{\text{CR2}}$, $S_{\text{CR3}}$, $S_{\text{CR4}}$, and $S_{\text{GF}}$ denote the datasets at the different interfaces.

For PC-GIPMNN, the interface conditions (29) and (30) are implemented as the loss functions (43) and (44) in this 2D example.

Table 5 Configurations for reactor problems with different materials. The configurations are similar to those in a previous study [66]. C-R denotes the control rod region, CR denotes a control rod material, and W denotes water

The eigenfunction is expressed as follows:

$$\phi=u\,l_1+v\,l_2+w\,l_3+r\,l_4+p\,l_5+q\,l_6,$$

where $l_1, l_2, l_3, l_4, l_5$, and $l_6$ denote the indicator functions of the corresponding subdomains.

Remark 11 In this experiment, it is important to use the normalization of Eq. (42) instead of the original L2 normalization, under which the eigenfunction values are too small for the neural network to optimize. Specifically, the L2 norm of the eigenfunction equals one if we seek a normalized eigenfunction; if the total number of points N is very large, the value of each component of the eigenvector becomes so small that it is difficult for the neural network to learn the eigenfunction.

4.2.2 Using DRM to solve for Eigenvalue and the failure of DRM during the 2D experiments

Given that the homogeneous Neumann boundary condition is used, the loss function in DRM can be defined as

$$\mathrm{Loss}_{\text{drm}}=\frac{\dfrac{\mathrm{Area}_{\text{square}}}{N}\displaystyle\sum_{i=1}^{N}\left(D|\nabla\Phi|^2+\Sigma_{\text{a}}\Phi^2\right)(x_i)}{\dfrac{\mathrm{Area}_{\text{square}}}{N}\displaystyle\sum_{i=1}^{N}\nu\Sigma_{\text{f}}\,\Phi^2(x_i)},$$

where $\mathrm{Area}_{\text{square}}$ denotes the area of the square domain shown in Fig. 7.

We use Nepoch = 500000 in DRM. We found that DRM learns the eigenvalue and eigenfunction at an early stage of the training process. Subsequently, DRM attempts to learn a smaller eigenvalue once it is close to the true eigenvalue. Finally, DRM ends up with one smaller eigenvalue that may be close to the true eigenvalue and one incorrect eigenfunction that is meaningless and far from the true eigenfunction. This phenomenon is illustrated in Fig. 8.

As listed in Table 6, although the neural network in DRM fails to learn the eigenfunction, the eigenvalue is close to the true value. This phenomenon may be caused by the minimization of the Rayleigh quotient: as mentioned in [49], the loss is pushed toward zero even though the true eigenvalue is not zero.

In Fig. 8, we observe that the failure of DRM in the 2D experiments occurs for Nepoch ≥ 60000. Therefore, we selected Nepoch = 50000 to retrain DRM and recorded the results. In a previous study [40], the stopping criterion for the PINN training process was investigated. In future studies, we will follow the technique discussed in [40] to determine the stopping criterion for DRM. In the next section, we compare the results of DRM trained with Nepoch = 50000 with the results of GIPMNN and PC-GIPMNN.

Table 6 Failure of DRM during the 2D experiments. The eigenfunctions of F1-F5 and R1-R7 are not correct, but the eigenvalues are close to the true values

Remark 12 The numerical results in Fig. 8 are not due to hardware problems or code errors but can be attributed to the nature of the neural network. The eigenvalue is approximated by constructing a Rayleigh quotient in DRM, and this eigenvalue is then treated as the loss function used to optimize the neural network. However, this mechanism of minimizing the eigenvalue leads to overfitting, as the neural network always attempts to find a point where the loss function tends toward zero.

4.2.3 Results

Similar to the results of the one-dimensional case, the relative errors of keff and of the eigenfunction φ are shown in Tables 7 and 8. The relative errors of keff obtained by all methods are small; the DRM results are those trained with Nepoch = 50000.

For problems F1, F2, F4, and R7, the relative error of keff obtained by DRM is smaller than that obtained by GIPMNN; the opposite holds for problems F3, F5, R3, R4, R5, and R6. However, although the relative error of keff obtained by GIPMNN is small, the relative error of φ simulated by GIPMNN is larger than that obtained by DRM. We observe that $k_{\text{eff}}^{\text{rel}}$ obtained by PC-GIPMNN is smaller than those obtained by GIPMNN and DRM for all problems. Furthermore, the relative errors of φ computed by PC-GIPMNN are smaller than those obtained by GIPMNN and DRM for all problems. Therefore, PC-GIPMNN successfully learns the eigenvalues and eigenfunctions.

In Figs. 9 and 10, the eigenfunctions computed by FEM are shown in the first column, and the relative errors of the eigenfunctions obtained by GIPMNN, PC-GIPMNN, and DRM are shown in the other columns for the different problems. The relative errors of the eigenfunctions computed by PC-GIPMNN and DRM are smaller than those of GIPMNN, which fails to learn some details. Compared with the eigenfunctions computed by FEM, the results obtained by PC-GIPMNN are the best among the three methods.

Table 7 Results obtained via GIPMNN, PC-GIPMNN, and DRM compared with FEM for the problems in Table 5. $k_{\text{eff}}^{\text{rel}}$ denotes the relative error of keff. Notably, DRM is trained with Nepoch = 50000

Table 8 Results obtained via GIPMNN, PC-GIPMNN, and DRM compared with FEM for the problems in Table 5. $\phi^{\text{rel}}$ denotes the relative error of the eigenfunction. Notably, DRM is trained with Nepoch = 50000

4.3 2D IAEA benchmark problem

We also considered the classical 2D IAEA benchmark problem reported in the study by Yang et al. [40], which was originally modeled using two-dimensional two-group diffusion equations. Here, the one-group neutron diffusion equation defined in Eq. (4) is considered; multigroup problems will be addressed in future study. The geometry is shown in Fig. 11. The main bulk of the reactor consists of two fuel regions, labeled 1 and 2, representing two types of fuel materials. Within its central region, there are four control rods, all labeled 3. The last region, labeled 4, is composed of water. Cross-sectional data for the 2D IAEA benchmark problem are presented in Table 9. It is worth noting that only one quarter of the reactor is shown in the figure, because the rest can be inferred by symmetry along the x- and y-axes. Therefore, this 2D IAEA benchmark problem involves the two types of boundary conditions defined in Eq. (7): the Neumann boundary condition on the x- and y-axes and the Robin boundary condition on the other boundaries.

4.3.1 Using GIPMNN and PC-GIPMNN to solve for Eigenvalue

The number of points $N=N_xN_y$ is used to calculate $\mathrm{Loss}_{\text{gipmnn}}$. The number of points $N_{\text{Nb}}$ on the x- and y-axes and the number of points $N_{\text{Rb}}$ on the other boundaries are used to calculate $\mathrm{Loss}_{\text{Nb}}$ and $\mathrm{Loss}_{\text{Rb}}$, which enforce the Neumann and Robin boundary conditions, respectively. The number of neurons is 20 for each hidden layer in ResNet, and Nepoch = 500000 for GIPMNN and PC-GIPMNN. As mentioned previously, we use the optimal ratio of outer to inner iterations, which is 1:1.

The 2D reactor is divided into seven parts, as shown in Fig. 11. The output layer has seven neurons, and seven functions are defined on the different subdomains, labeled u, v, w, r, p, q, and h: u, v, w, and r denote the functions defined in the control rods, and p, q, and h denote the functions defined in the two fuel regions and the water region, respectively. $S_{up}$, $S_{vp}$, $S_{wp}$, $S_{rp}$, $S_{rq}$, $S_{pq}$, and $S_{qh}$ denote the datasets at the different interfaces.

For PC-GIPMNN, the interface conditions (29) and (30) are implemented as the loss functions (47) and (48), respectively, in this 2D example.

Fig. 9 (Color online) The first column shows the heatmap of the eigenfunction obtained by FEM, and the other columns show the heatmaps of the relative errors of GIPMNN (second column), PC-GIPMNN (third column), and DRM (fourth column) for problems F1, F2, F3, F4, and F5. Evidently, GIPMNN yields less favorable results than DRM. However, by enforcing the interface conditions, PC-GIPMNN outperforms both GIPMNN and DRM, as shown in the third column

The eigenfunction can be expressed as follows:

$$\phi=u\,l_1+v\,l_2+w\,l_3+r\,l_4+p\,l_5+q\,l_6+h\,l_7,$$

where $l_1,\dots,l_7$ are the indicator functions of the corresponding subdomains.

4.3.2 Using DRM to solve for Eigenvalue

Given that the homogeneous Neumann boundary condition is used on the symmetry axes, the loss function in DRM omits the Neumann contribution and accounts only for the Robin boundary condition. The loss function is defined as

$$\mathrm{Loss}_{\text{drm}}=\frac{\dfrac{\mathrm{Area}}{N}\displaystyle\sum_{i=1}^{N}\left(D|\nabla\Phi|^2+\Sigma_{\text{a}}\Phi^2\right)(x_i)+\dfrac{\mathrm{Length}}{2N_{\text{Rb}}}\displaystyle\sum_{i=1}^{N_{\text{Rb}}}\Phi^2(x_i)}{\dfrac{\mathrm{Area}}{N}\displaystyle\sum_{i=1}^{N}\nu\Sigma_{\text{f}}\,\Phi^2(x_i)},$$

Fig. 10 (Color online) The first column shows the heatmap of the eigenfunction obtained by FEM, and the other columns show the heatmaps of the relative errors of GIPMNN (second column), PC-GIPMNN (third column), and DRM (fourth column) for problems R3, R4, R5, R6, and R7. Obviously, GIPMNN yields less favorable results than DRM. However, by enforcing the interface conditions, PC-GIPMNN outperforms GIPMNN and DRM, as shown in the third column

where Area denotes the area of all regions, Length indicates the length of the boundaries other than the x- and y-axes in Fig. 11, and $N_{\text{Rb}}$ denotes the number of points on the Robin boundary.

4.3.3 Results

As discussed above, we also trained DRM with Nepoch = 500000 and found that DRM again failed to learn the eigenfunction. In this case, DRM attains keff = 0.9750, and the relative errors of keff and φ are 6.4023 × 10⁻³ and 0.9557, respectively. Therefore, we retrained DRM with Nepoch = 50000 and stored the best results.

The relative errors of keff and of the eigenfunction φ are shown in Tables 10 and 11. All three methods obtain good results, and the relative error of keff for DRM is small. However, the ability of DRM to learn the eigenfunction is worse than that of GIPMNN and PC-GIPMNN, and its relative errors of φ are the largest. Thus, it can be concluded that PC-GIPMNN successfully learns the eigenvalue and eigenfunction. The same conclusion can be drawn from the graphs in Fig. 12.

Fig. 11 (Color online) Geometry of the 2D IAEA benchmark problem with two fuel regions, four control-rod regions, and a water region labeled 1, 2, 3, and 4. This figure is similar to the figure reported in a previous study [40]

Table 11 Results obtained via GIPMNN, PC-GIPMNN, and DRM compared with FEM for the 2D IAEA benchmark problem. $\phi^{\text{rel}}$ denotes the relative error of the eigenfunction. Notably, DRM is trained with Nepoch = 50000

For the 1D slab reactor, the computation time of FEM is 1.25 s, and the training times of DRM, GIPMNN, and PC-GIPMNN are 4788.37 s, 10614.57 s, and 16052.16 s, respectively. For the 2D reactor, the computation time of FEM is 3.64 s, and the training times of DRM, GIPMNN, and PC-GIPMNN are 5827.91 s, 18444.39 s, and 108072.41 s, respectively. For the 2D IAEA benchmark, the computation time of FEM is 5.22 s, and the training times of DRM, GIPMNN, and PC-GIPMNN are 29352.83 s, 64546.74 s, and 137812.92 s, respectively.

Although PC-GIPMNN performs better than DRM and GIPMNN, it requires significantly more training time to obtain accurate results. Compared with FEM, all the neural network methods require far more time. However, neural networks are an emerging method, and we believe they will achieve better results in the near future.

Table 9 Cross section data for the 2D IAEA benchmark problem

5 Conclusion

In this study, we proposed two methods, GIPMNN and PC-GIPMNN, to solve generalized K-eigenvalue problems in nuclear reactor physics, and we conducted a comprehensive study of GIPMNN, PC-GIPMNN, and DRM. GIPMNN follows the main idea of the inverse power method to find the smallest eigenvalue. PC-GIPMNN additionally enforces interface conditions through multiple neurons in the output layer. The concept of DRM is to define the Rayleigh quotient as the objective and form an optimization problem. Unlike DRM, which solves for the smallest eigenvalue by directly minimizing the eigenvalue (Rayleigh quotient), GIPMNN and PC-GIPMNN attain the smallest eigenvalue using an iterative method. All the methods use neural networks to represent functions, with the differential operators implemented using AD. Finally, we applied these three methods to problems in reactor physics.

Three numerical experiments were conducted to verify the applicability and accuracy of GIPMNN, PC-GIPMNN, and DRM. In the first 1D example, we used inner and outer iterations for the simulation. According to our tests, the best ratio of outer to inner iterations is 1:1. Furthermore, we compared the results of GIPMNN, PC-GIPMNN, and DRM with those of FEM. For the continuous problem, the solution learned by GIPMNN was more accurate than those learned by DRM and PC-GIPMNN. For the interface problems, the eigenvalue and eigenfunction learned by PC-GIPMNN were better than those learned by DRM and GIPMNN, owing to the interface conditions implemented in the loss function of PC-GIPMNN.

Table 10 Results obtained via GIPMNN, PC-GIPMNN, and DRM compared with FEM for the 2D IAEA benchmark problem. $k_{\text{eff}}^{\text{rel}}$ denotes the relative error of keff. Notably, DRM is trained with Nepoch = 50000

The bold numbers are the best results

Problem | keff (FEM) | keff (GIPMNN) | keff (PC-GIPMNN) | keff (DRM) | $k_{\text{eff}}^{\text{rel}}$ (GIPMNN) | $k_{\text{eff}}^{\text{rel}}$ (PC-GIPMNN) | $k_{\text{eff}}^{\text{rel}}$ (DRM)
IAEA | 0.9688 | 0.9692 | 0.9691 | 0.9685 | 4.0370 × 10⁻⁴ | 3.0812 × 10⁻⁴ | 2.8218 × 10⁻⁴

Fig. 12 (Color online) The first column shows the heatmap of the eigenfunction obtained by FEM, and the other columns show the heatmaps of the relative errors of GIPMNN (second column), PC-GIPMNN (third column), and DRM (fourth column) for the 2D IAEA benchmark problem. By enforcing the interface conditions, PC-GIPMNN outperforms GIPMNN and DRM, as shown in the third column

In the 2D examples, we observed the failure of DRM: it learns the eigenvalue and eigenfunction at the early stage of the training process, but the results deteriorate as Nepoch increases. Therefore, we selected Nepoch = 50000 to train DRM and compared the results obtained by GIPMNN, PC-GIPMNN, and DRM with those obtained by FEM. The results show that PC-GIPMNN outperforms GIPMNN and DRM in terms of both the eigenvalue and the eigenfunction. Moreover, GIPMNN and PC-GIPMNN are more stable than DRM.

Although good results were obtained, some aspects still need to be examined in the future. First, given that the architecture of a neural network significantly influences the accuracy of our methods, it is important to design a universal network architecture that achieves high accuracy. For example, the MLP is widely used in the current field of solving PDEs with neural networks, and ResNet [60] is effective in improving the convergence rate and may even obtain better results than the MLP. Recently, a transformer [74] was used for operator learning and obtained better results. Wang et al. [62] investigated the effect of the MLP on operator learning and proposed a modified MLP [61], a new MLP architecture that improves accuracy. Second, although GIPMNN and PC-GIPMNN are more stable than DRM, and PC-GIPMNN is more accurate than DRM, they require a large number of iterations and a long training time to achieve good results; improving convergence and reducing training time will be investigated in our future study. Furthermore, the failure of DRM in the 2D eigenvalue experiments should be studied and clarified. Finally, for the interface problem, a suitable sampling algorithm could facilitate the training process and provide a better approximation. Future work could also involve implementing the proposed networks as core solvers in emerging reactor digital twins [75-77].

Author Contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Qi-Hong Yang, Yu Yang, Yang-Tao Deng, Qiao-Lin He, He-Lin Gong, and Shi-Quan Zhang. The first draft of the manuscript was written by Qi-Hong Yang and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Declarations

Conflict of interest The authors declare no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.