
Partition coefficient prediction of Baker's yeast invertase in aqueous two phase systems using hybrid group method data handling neural network

2017-05-28

Carlos Eduardo de Araújo Padilha, Sérgio Dantas de Oliveira Júnior, Domingos Fabiano de Santana Souza, Jackson Araújo de Oliveira, Gorete Ribeiro de Macedo, Everaldo Silvino dos Santos*

Laboratory of Biochemical Engineering, Chemical Engineering Department, Federal University of Rio Grande do Norte (UFRN), Natal, RN, Brazil

1. Introduction

Invertase (β-fructofuranoside fructohydrolase; E.C. 3.2.1.26) is a glucoenzyme that catalyzes the hydrolysis of sucrose, producing an equimolar mixture of glucose and fructose. It can be found in animals, higher plants, filamentous fungi, yeast and bacteria [1–5]. The inverted sugar obtained from the hydrolysis reaction has a lower crystallinity than sucrose; therefore, it can be used in the food industry to obtain fresh and soft products. It has also been used in the production of fruit jelly, candy, chocolate and biscuits [6,7].

Partitioning in an Aqueous Two Phase System (ATPS), a kind of liquid–liquid extraction, has been used as a powerful tool in the separation and purification of proteins, nucleic acids, microorganisms, and plant and animal cells [4,8,9]. Compared with other techniques used in downstream processing, ATPS shows many advantages, such as lower cost, low interfacial tension, biocompatibility, non-toxicity, the possibility of continuous operation and ease of scale-up [4,10–12]. An ATPS is usually formed by mixing two hydrophilic polymers (e.g. PEG/dextran) or a polymer and an inorganic salt of high ionic strength (e.g. PEG/potassium phosphate, PEG/sodium citrate, PEG/ammonium sulfate, etc.). After mixing, larger aggregates are built up and the system components tend to separate into two phases, mainly by steric exclusion [13]. Most ATPS studies have exploited the effects on partitioning of the physicochemical properties of the solute, such as hydrophobicity, charge, size, concentration and bioaffinity, or of changes in system parameters such as type of system, constituents, tie-line length (TLL), pH and temperature [14]. In fact, the mechanisms responsible for the partitioning of biomolecules in ATPS are quite complex and not easily predicted. The interactions between the biomolecules and the surrounding phase arise mainly from van der Waals forces, hydrogen bonding, hydrophobic interactions and electrostatic forces [13]. Even though thermodynamic models applied to ATPS, such as Flory–Huggins, Wilson and UNIFAC [11,15–17], have advanced, all of them show limitations in predicting the partition behavior of proteins from broth.

On the other hand, models based on intelligent systems are playing an important role in solving problems in science and engineering. They have been used when solving sophisticated equations demands excessive time and effort, or when simplified theory cannot predict the data satisfactorily [18–22]. The Group Method Data Handling (GMDH) is an inductive modeling method based on the Backpropagation Artificial Neural Network (BPANN) as well as on the concept of natural selection [23]. GMDH organizes its own architecture by a heuristic self-organization method, and it can determine the number of processing layers and the useful input variables. As a result, a high-order polynomial expression that establishes relevant links between input and output variables is available to the user [24]. Thus, this algorithm has been used for modeling, prediction, data mining and system identification [24–28]. The mathematical simplicity of GMDH and the wide availability of neural network software have increased the interest of research groups in the direct modeling of liquid–liquid separation processes [29], including the prediction of partition coefficients of biomolecules [20,30].

In this sense, the objective of this paper was to develop a hybrid GMDH neural network for predicting partition coefficients of invertase from Baker's yeast in a PEG/MgSO4 ATPS. Experimental partition data were used to train the hybrid GMDH neural network, and its performance was compared with that of an original GMDH network and a BPANN using different statistical metrics.

2. Material and Methods

2.1. Material

Polyethylene glycol (PEG) with average molar masses of 1500, 4000 and 6000; hydrochloric acid, sodium hydroxide, anhydrous monobasic potassium phosphate, anhydrous dibasic potassium phosphate, magnesium sulfate heptahydrate and manganese sulfate monohydrate were acquired from Synth (Diadema, São Paulo, Brazil) and used in the PEG/MgSO4 Aqueous Two Phase Systems. Baker's yeast was acquired from a local supplier. The double distilled water used in the experiments was from a MilliQ system.

2.2. Baker's yeast extract

Baker's yeast was suspended in 0.2 mol·L−1 acetate (pH 4.0–5.0) and 0.2 mol·L−1 phosphate (pH 6.0–7.0) buffers and then stirred with a magnetic stirrer for 30 min. The suspension was sonicated for 5 min (Ultrasonic Cleaner, Unique) and then centrifuged (Centrifuge 5804 R, Eppendorf) at 930×g for 15 min at 4 °C. The supernatant was withdrawn and stored in a freezer at −4 °C until use in the ATPS experiments.

2.3. Aqueous Two Phase Systems (ATPS)

Stock solutions of PEG 1500, 4000 and 6000 (50 wt%) as well as MgSO4 (25 wt%) were prepared using double distilled water. In order to avoid protein precipitation, the solutions containing PEG, MgSO4, the co-solute MnSO4 and double distilled water were first mixed in a conical tube, and then the cell homogenate was added, according to Karkaş and Önal [4]. The system pH was adjusted with 3 mol·L−1 HCl or 3 mol·L−1 NaOH. The total weight of the phase system was 10 g, and the operational conditions used are given in Table 1. Tubes were stirred for 45 s at 25 °C and then centrifuged at 930×g for 15 min at 25 °C. The phases were collected using an automatic pipette, followed by the enzymatic activity assay.

The partition coefficient (K) is defined as the ratio of the enzymatic activity in the top (PEG-rich) phase to the enzymatic activity in the bottom (salt-rich) phase, as shown in Eq. (1).

K = At/Ab    (1)

At and Ab are the enzymatic activities (U·ml−1) in the PEG-rich phase and the salt-rich phase, respectively.
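The definition in Eq. (1) can be illustrated with a minimal Python sketch. The function name and the activity values are hypothetical and serve only to show the arithmetic; they are not measurements from this work.

```python
def partition_coefficient(activity_top: float, activity_bottom: float) -> float:
    """Return K = A_t / A_b for activities given in the same units (e.g. U/ml)."""
    if activity_bottom <= 0:
        raise ValueError("bottom-phase activity must be positive")
    return activity_top / activity_bottom


# Hypothetical activities (U/ml), chosen only for illustration.
k = partition_coefficient(activity_top=0.5, activity_bottom=10.0)
print(f"K = {k:.3f}")
```

A value of K below one indicates that most of the enzymatic activity is recovered in the salt-rich bottom phase.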

2.4. Enzymatic activity assay

Invertase activity was determined by measuring the quantity of reducing sugars formed during the hydrolysis of sucrose using the 3,5-dinitrosalicylic acid (DNS) method [31,32]. The reaction was carried out using 0.6 ml of 0.2 mol·L−1 acetate buffer (pH 5.0), 0.2 ml of 0.5 mol·L−1 sucrose and 0.2 ml of the invertase solution, incubated at 37 °C for 30 min. Then, 1.0 ml of DNS reagent was added, followed by boiling for 10 min. The samples were cooled to room temperature and the reducing sugars were quantified using a spectrophotometer (Thermo Spectronic model Genesys 10UV) at 540 nm. One unit of invertase activity was defined as the amount of enzyme that released 1.0 μmol of reducing sugars, expressed as glucose, per minute at pH 5.0 and 37 °C. For the experiments, the initial activity of the cell homogenate was 13.4 U·ml−1.
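The unit definition above translates into a simple calculation, sketched here in Python. The function name is hypothetical, the default reaction time and enzyme volume are taken from the assay description, and the sketch assumes that the amount of reducing sugar (in μmol glucose equivalents) has already been obtained from a DNS calibration curve and that no additional sample dilution is involved.

```python
def invertase_activity_u_per_ml(reducing_sugar_umol: float,
                                time_min: float = 30.0,
                                enzyme_volume_ml: float = 0.2) -> float:
    """Activity in U/ml, where 1 U releases 1 umol of reducing sugar per minute."""
    if time_min <= 0 or enzyme_volume_ml <= 0:
        raise ValueError("time and enzyme volume must be positive")
    return reducing_sugar_umol / (time_min * enzyme_volume_ml)


# Hypothetical reading: 6.0 umol of reducing sugar released during the assay.
activity = invertase_activity_u_per_ml(6.0)
print(f"{activity:.2f} U/ml")
```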

3. Model Formulation

3.1. GMDH

The GMDH algorithm is based on selecting the most suitable quadratic polynomial expressions generated by connecting two independent variables. At every iteration a new neuron layer is built, which increases the order of the polynomial expressions. Generic connections between the input and output variables can be expressed by complex polynomial series such as the Volterra–Kolmogorov–Gabor (VKG) polynomial [20,23,28,30,33], as shown in Eq. (2).

X = (1, x1, x2, ..., xN) is the vector of input variables, a is the vector of weights and ŷ is the predicted output. The generic VKG equation can be simplified to quadratic expressions of only two variables (Eq. (3)), with coefficients given by the column vector a (Eq. (4)).
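For reference, the VKG series and the two-variable quadratic node are commonly written as follows in the GMDH literature. This is a sketch of the standard forms; the ordering and indexing of the coefficients are assumptions and may differ from Eqs. (2) to (4).

```latex
% Generic Volterra-Kolmogorov-Gabor series for N inputs (standard form)
\hat{y} = a_0 + \sum_{i=1}^{N} a_i x_i
        + \sum_{i=1}^{N}\sum_{j=1}^{N} a_{ij}\, x_i x_j
        + \sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{k=1}^{N} a_{ijk}\, x_i x_j x_k + \cdots

% Quadratic node connecting two inputs x_i and x_j (coefficient order assumed)
\hat{y} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2

% Coefficients collected in a column vector
\mathbf{a} = \left[ a_0,\ a_1,\ a_2,\ a_3,\ a_4,\ a_5 \right]^{T}
```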

Every node leads to a set of coefficients a, which can be estimated from the training group using the ordinary least squares (OLS) method. The generalization capacity of the network is evaluated by comparing the predicted data with the testing data using a statistical metric. In the present work, the criterion used for neuron selection was the absolute average relative deviation (AARD), in percentage, as described in Eq. (5).
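A minimal Python sketch of the AARD criterion is given below. It assumes the usual definition, AARD = (100/N) Σ |(y_exp − y_pred)/y_exp|; the function name and the sample values are hypothetical.

```python
from typing import Sequence


def aard_percent(y_exp: Sequence[float], y_pred: Sequence[float]) -> float:
    """Absolute average relative deviation in percent (assumed standard definition)."""
    if len(y_exp) != len(y_pred) or len(y_exp) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Experimental values must be non-zero, since each deviation is relative to them.
    total = sum(abs((e - p) / e) for e, p in zip(y_exp, y_pred))
    return 100.0 * total / len(y_exp)


# Hypothetical experimental and predicted partition coefficients.
print(aard_percent([0.10, 0.20], [0.11, 0.18]))
```

Because each deviation is divided by the experimental value, small partition coefficients weigh heavily in this metric.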

In order to determine the best results, the value of E was minimized by setting its derivative with respect to every coefficient ai to zero (Eq. (6)).

Solving Eq. (6) is a typical problem of minimization with constraints that can be represented by a set of linear equations. Therefore, using the training data according to Eqs. (7) and (8), the direct solution can be obtained through Eq. (9).
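The least-squares step can be sketched in plain Python as follows. The sketch assumes that E is the sum of squared errors, so that setting its derivatives to zero gives the normal equations (AᵀA)a = AᵀY, and it assumes the quadratic node form ŷ = a0 + a1·xi + a2·xj + a3·xi·xj + a4·xi² + a5·xj². All function names and the synthetic data are hypothetical; in practice a library least-squares routine would normally replace the small solver.

```python
from typing import List, Sequence, Tuple


def quadratic_row(xi: float, xj: float) -> List[float]:
    """Regressors of one two-variable quadratic node (term order is an assumption)."""
    return [1.0, xi, xj, xi * xj, xi * xi, xj * xj]


def solve_linear(m: List[List[float]], b: List[float]) -> List[float]:
    """Solve m @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(m, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: training points are not varied enough")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_node(pairs: Sequence[Tuple[float, float]], y: Sequence[float]) -> List[float]:
    """Estimate the node coefficients from the normal equations (A^T A) a = A^T y."""
    rows = [quadratic_row(xi, xj) for xi, xj in pairs]
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    aty = [sum(r[p] * t for r, t in zip(rows, y)) for p in range(k)]
    return solve_linear(ata, aty)


def predict_node(a: Sequence[float], xi: float, xj: float) -> float:
    return sum(c * f for c, f in zip(a, quadratic_row(xi, xj)))


# Synthetic check: noise-free targets generated from known coefficients.
grid = [(u, v) for u in (0.0, 0.5, 1.0) for v in (0.0, 0.5, 1.0)]
true_a = [0.02, 0.10, -0.05, 0.03, 0.01, 0.04]
targets = [predict_node(true_a, u, v) for u, v in grid]
fitted = fit_node(grid, targets)
print([round(c, 6) for c in fitted])
```

With noise-free synthetic data the fitted coefficients should reproduce the generating ones up to rounding error, which gives a quick check of the solver.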

Table 1 Partitioning values obtained by hybrid and original GMDH neural network

3.2. Hybrid GMDH neural network

In the original GMDH network, the nodes in every layer are the product of two candidates. In this approach, the effects of the other individual variables, as well as the combined effects between them, are neglected. This makes the subsequent nodes less accurate and reduces the capacity to describe systems with a higher level of nonlinearity. In order to improve the performance of GMDH models, the hybridization of GMDH with a traditional neural network was proposed. In this case, each node is generated by any combination of the inputs up to polynomial order two, as shown in Eq. (10). Another change used to enhance the model complexity is the crossing of nodes from different layers. As the number of possible combinations among the nodes increases, the proposed model can better follow the nonlinear trends of the systems [20,30]. As in the original GMDH network, the parameters (a) of the hybrid GMDH network were estimated by the OLS method.
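To illustrate how a hybrid node can draw on more than two inputs, the following Python sketch enumerates every polynomial term up to order two for a given number of inputs. It is an assumption-based illustration of the idea behind Eq. (10), not the authors' implementation, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence, Tuple


def monomials_up_to_order_two(n_inputs: int) -> List[Tuple[int, ...]]:
    """Index tuples of all terms: () is the constant, (i,) is x_i, (i, j) is x_i * x_j."""
    terms: List[Tuple[int, ...]] = [()]
    for degree in (1, 2):
        terms.extend(combinations_with_replacement(range(n_inputs), degree))
    return terms


def evaluate_terms(x: Sequence[float], terms: Sequence[Tuple[int, ...]]) -> List[float]:
    """Evaluate each monomial at the input vector x."""
    values = []
    for term in terms:
        value = 1.0
        for idx in term:
            value *= x[idx]
        values.append(value)
    return values


# With six inputs, as in this work: 1 constant + 6 linear + 21 second-order terms.
print(len(monomials_up_to_order_two(6)))
```

A node restricted to two inputs uses only six of these terms, which is the case covered by the original GMDH network.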

4. Results and Discussion

A hybrid GMDH neural network was used to predict the partition coefficients of invertase from Baker's yeast in the PEG/MgSO4 Aqueous Two Phase System. A set of 67 runs was used for the training step (66.7% of the overall data) and the testing step (33.3% of the overall data). The input variables for the hybrid GMDH neural network were the average molar mass of PEG (x1), pH (x2), PEG (x3), MgSO4 (x4), cell homogenate (x5) and MnSO4 (x6). In order to observe the real effect of each input variable, all input values were normalized to the range from zero to one. The partition coefficient of invertase was the output variable of the model. Table 1 lists the operational conditions of the runs together with the experimental partition coefficients and those predicted by both the hybrid and original GMDH neural networks.
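The normalization of inputs to the range from zero to one can be sketched with min-max scaling, which is assumed here because the text does not name the method. The function name is hypothetical; the example column uses the three PEG molar masses mentioned in Section 2.1.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Map a column of values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values]


peg_molar_mass = [1500.0, 4000.0, 6000.0]
scaled = min_max_scale(peg_molar_mass)
print([round(v, 3) for v in scaled])
```

Scaling every input to the same range makes the magnitudes of the fitted coefficients comparable across variables.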

In this study, the hybrid and original GMDH neural networks were developed with four and five layers of neurons, respectively. As can be seen in Fig. 1, the hybrid network has six neurons in the input layer, two middle layers (with four and two neurons, respectively) and an output layer.

Fig. 1. Structure of the hybrid (a) and original GMDH (b) neural networks proposed.

Unlike the conventional GMDH, the proposed hybrid network shows crossing of nodes from different layers, for instance the interaction of inputs x4 and x5 at the 2nd middle layer. The expressions generated for each node in the layers, as well as the total correlation functions of the hybrid and original GMDH neural networks, are shown in Tables 2 and 3. In the hybrid GMDH model, high linear contributions of x1 and x4 to the K values were observed. This may be because the PEG molecular weight and MgSO4 strongly affected the phase behavior and, consequently, the enzyme partitioning [13,34]. On the other hand, the input variables of the original GMDH model have impacts of the same intensity on the response. Moreover, the effects of the cell homogenate were not incorporated in this model.

Table 2 Node expressions for the hybrid GMDH neural network used to predict the partition coefficients of invertase using ATPS

Table 3 Node expressions for the original GMDH neural network used to predict the partition coefficients of invertase using ATPS

The statistical metrics Absolute Fraction of Variance (R2), Root Mean Square Error (RMSE), Mean Square Error (MSE) and Mean Absolute Deviation (MAD) for the training and testing steps of the hybrid GMDH neural network are reported in Table 4.
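The metrics listed above can be computed as in the following Python sketch. The definitions are assumptions: MSE, RMSE and MAD follow their usual forms, and R2 is implemented as the ordinary coefficient of determination, 1 − SS_res/SS_tot, whereas the "absolute fraction of variance" used by the authors may use a different denominator. The function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def mse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean square error."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(mse(y, y_hat))


def mad(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute deviation between experimental and predicted values."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def r_squared(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values used only to exercise the functions.
y_exp = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(mse(y_exp, y_pred), rmse(y_exp, y_pred), mad(y_exp, y_pred), r_squared(y_exp, y_pred))
```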

As can be seen in Table 4, the hybrid GMDH neural network model showed good adequacy and accuracy in predicting the partition coefficients of invertase from Baker's yeast in the PEG/MgSO4 ATPS. The model also showed good generalization capacity, since the RMSE, MSE and MAD values for the testing group were lower than those for the training group.

A comparison of the experimental and predicted values of the partition coefficients is shown in Fig. 2.

Table 4 Statistical criteria for training and testing of the hybrid GMDH neural network

Fig. 2. Partition coefficients predicted versus experimental for both hybrid GMDH neural network and original GMDH neural network.

As can be observed in this figure, the results for the proposed hybrid model were quite good, even though some deviation can be observed at higher partition coefficients. Overall, partition coefficients lower than one were observed, showing that invertase moved to the salt-rich phase. In fact, according to Karkaş and Önal [4], invertase shows a preference for the bottom phase after the addition of the co-solute.

The performance of the hybrid GMDH model was also assessed by the absolute average relative deviation in percentage (AARD), according to Eq. (5). A BPANN model with two processing layers, each with 10 neurons, was used for comparison with both the hybrid GMDH and original GMDH neural networks. According to Table 5, the AARD of the hybrid GMDH model is lower than that of the other two models, despite the smaller number of parameters involved. This indicates that this model gives the best fit for predicting the partition coefficients of invertase from Baker's yeast in the PEG/MgSO4 system. Similarly, Pazuki and Kakhki [20] showed that the hybrid GMDH model was superior to the GMDH and UNIFAC-FV approaches in predicting the partition coefficient of Penicillin G Acylase in PEG/potassium phosphate and PEG/sodium citrate systems. The AARD values obtained with the networks in this work are larger than those reported in the literature. This can be explained by the wide range of the partition coefficient values, from as low as 0.004 to as high as 0.171, i.e., about a forty-three-fold difference, as can be observed in Fig. 2.
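The remark about the number of parameters can be made concrete for the BPANN with a short Python sketch. It assumes a fully connected network with bias terms, six inputs (x1 to x6), two hidden layers of 10 neurons and a single output; the function name is hypothetical and the resulting count is an estimate under these assumptions, not a figure reported by the authors.

```python
from typing import Sequence


def mlp_parameter_count(layer_sizes: Sequence[int]) -> int:
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Assumed topology: 6 inputs -> 10 -> 10 -> 1 output.
bpann_params = mlp_parameter_count([6, 10, 10, 1])
print(bpann_params)  # (6+1)*10 + (10+1)*10 + (10+1)*1 = 191
```

For comparison, a single two-variable quadratic GMDH node carries six coefficients, so a network of a few such nodes stays well below this count.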

Table 5 Performance comparison of the hybrid GMDH, original GMDH and BPANN models

5. Conclusions

A hybrid GMDH neural network built with three layers and nine neurons was used to predict the partitioning of invertase from Baker's yeast in PEG/MgSO4 Aqueous Two Phase Systems. The network structure allowed interactions among more than two input variables at a time, as well as the crossing of neurons from different layers, thus giving a higher model complexity. Despite its degree of nonlinearity, the hybrid model has quite good generalization capacity, as shown by comparing the R2, RMSE, MSE and MAD values of the training and testing steps. It was also shown that the proposed model has better prediction performance than both the original GMDH model and the BPANN in terms of AARD. In general, the hybrid GMDH neural network is a powerful tool for predicting the partition coefficients of invertase in ATPS and appears to be an interesting option for data treatment of other Aqueous Two Phase Systems.

Acknowledgments

The authors thank CAPES and the Brazilian National Council of Research (CNPq) (Grant 407684/2013-1) for the financial support.

[1]S.Talekar,V.Ghodake,A.Kate,N.Samant,C.Kumar,S.Gadagkari,Preparation and characterization of cross-linked enzyme aggregates of Saccharomyces cerevisiae invertase,Aust.J.Basic Appl.Sci.4(2010)4760–4765.

[2]Z.Lazar,E.Walczak,M.Robak,Simultaneous production of citric acid and invertase by Yarrowia lipolytica SUC+ transformants,Bioresour.Technol.102(2011)6982–6989.

[3]M.C.Madhusudhan,K.S.M.S.Raghavarao,Aqueous two phase extraction of invertase from Baker's yeast:Effect of process parameters on partitioning,Process Biochem.46(2011)2014–2020.

[4]T.Karkaş,S.Önal,Characteristics of invertase partitioned in poly(ethylene glycol)/magnesium sulfate aqueous two-phase system,Biochem.Eng.J.60(2012)142–150.

[5]G.E.A.Awad,H.Amer,E.W.El-Gammal,W.A.Helmy,M.A.Esawy,M.M.M.Elnashar,Production optimization of invertase by Lactobacillus brevis mm-6 and its immobilization on alginate beads,Carbohydr.Polym.93(2013)740–746.

[6]E.J.Tomotani,M.Vitolo,Production of high-fructose syrup using immobilized invertase in membrane reactor,J.Food Eng.80(2007)662–667.

[7]M.Plascencia-Espinosa,A.Santiago-Hernández,P.Pavón-Orozco,V.Vallejo-Becerra,S.Trejo-Estrada,A.Sosa-Peinado,C.G.Benitez-Cardoza,M.E.Hidalgo-Lara,Effect of deglycosylation on the properties of thermophilic invertase purified from the yeast Candida guilliermondii Mplla,Process Biochem.49(2014)1480–1487.

[8]A.S.Schmidt,A.M.Ventom,J.A.Asenjo,Partitioning and purification of α-amylase in aqueous two-phase systems,Enzym.Microb.Technol.16(1994)131–142.

[9]D.Z.Wei,J.H.Zhu,X.J.Cao,Enzymatic synthesis of cephalexin in aqueous two-phase systems,Biochem.Eng.J.11(2002)95–99.

[10]B.R.Babu,N.K.Rastogi,K.S.M.S.Raghavarao,Liquid–liquid extraction of bromelain and polyphenol oxidase using aqueous two-phase system,Chem.Eng.Process.47(2008)83–89.

[11]S.Shahriari,V.Taghikhani,M.Vossoughi,A.A.Safekordi,I.Alemzadeh,G.R.Pazuki,Measurement of partition coefficients of β-amylase and amyloglucosidase enzymes in aqueous two-phase systems containing poly(ethylene glycol)and Na2SO4/KH2PO4 at different temperatures,Fluid Phase Equilib.292(2010)80–86.

[12]L.Ferreira,X.Fan,L.M.Mikheeva,P.P.Madeira,L.Kurgan,V.N.Uversky,B.Y.Zaslavsky,Structural features for differences in protein partitioning in aqueous dextran-polyethylene glycol two-phase systems of different ionic compositions,Biochim.Biophys.Acta 1844(2014)694–704.

[13]J.A.Asenjo,B.A.Andrews,Aqueous two-phase systems for protein separation:A perspective,J.Chromatogr.A 1218(2011)8826–8835.

[14]I.Yücekan,S.Önal,Partitioning of invertase from tomato in poly(ethylene glycol)/sodium sulfate aqueous two-phase systems,Process Biochem.46(2011)226–232.

[15]H.Hartounian,E.W.Kaler,S.I.Sandler,Aqueous two-phase systems.2.Protein partitioning,Ind.Eng.Chem.Res.33(1994)2294–2300.

[16]P.A.Pessôa Filho,R.S.Mohamed,Thermodynamic modeling of the partitioning of biomolecules in aqueous two-phase systems using a modified Flory–Huggins equation,Process Biochem.39(2004)2075–2083.

[17]S.Gautam,L.Simon,Prediction of equilibrium phase compositions and β-glucosidase partition coefficient in aqueous two-phase systems,Chem.Eng.Commun.194(2006)117–128.

[18]A.M.F.Fileti,G.A.Fischer,E.B.Tambourgi,Neural modeling of bromelain extraction by reversed micelles,Braz.Arch.Biol.Technol.53(2010)455–463.

[19]J.Luo,W.Lin,X.Cai,J.Li,Optimization of fermentation media for enhancing nitrite-oxidizing activity by artificial neural network coupling genetic algorithm,Chin.J.Chem.Eng.20(2012)950–957.

[20]G.Pazuki,S.S.Kakhki,A hybrid GMDH neural network to investigate partition coefficients of Penicillin G Acylase in polymer-salt aqueous two-phase systems,J.Mol.Liq.188(2013)131–135.

[21]S.Atashrouz,G.Pazuki,Y.Alimoradi,Estimation of the viscosity of nine nanofluids using a hybrid GMDH-type neural network system,Fluid Phase Equilib.372(2014)43–48.

[22]F.Parvizian,M.Rahimi,S.M.Hosseini,Prediction of the characteristics of a new sonochemical reactor using an expert model,Chem.Eng.Commun.203(2016)683–691.

[23]A.G.Ivakhnenko,Polynomial theory of complex systems,IEEE Trans.Syst.Man Cybern.1(1971)364–378.

[24]S.Z.Reyhani,H.Ghanadzadeh,L.Puigjaner,F.Recances,Estimation of liquid–liquid equilibrium for a quarternary system using the GMDH algorithm,Ind.Eng.Chem.Res.48(2009)2129–2134.

[25]S.Ketabchi,H.Ghanadzadeh,A.Ghanadzadeh,S.Fallahi,M.Ganji,Estimation of VLE of binary systems(tert-butanol+2-ethyl-1-hexanol)and(n-butanol+2-ethyl-1-hexanol)using GMDH-type neural network,J.Chem.Thermodyn.42(2010)1352–1355.

[26]H.Ghanadzadeh,M.Ganji,S.Fallahi,Mathematical model of liquid–liquid equilibrium for a ternary system using the GMDH-type neural network and genetic algorithm,Appl.Math.Model.36(2012)4096–4105.

[27]T.Kondo,J.Ueno,S.Takao,Hybrid multi-layered GMDH-type neural network using principal component regression analysis and its application to medical image diagnosis of liver cancer,Procedia Comput.Sci.22(2013)172–181.

[28]C.E.A.Padilha,C.A.A.Padilha,D.F.S.Souza,J.A.Oliveira,G.R.Macedo,E.S.Santos,Prediction of rhamnolipid breakthrough curves on activated carbon and amberlite XAD-2 using artificial neural network and group method data handling models,J.Mol.Liq.206(2015)293–299.

[29]M.Moghadam,S.Asgharzadeh,On the application of artificial neural network for modeling liquid–liquid equilibrium,J.Mol.Liq.220(2016)339–345.

[30]S.Abdolrahimi,B.Nasernejad,G.Pazuki,Prediction of partition coefficients of alkaloids in ionic liquids based aqueous biphasic systems using hybrid group method of data handling(GMDH)neural network,J.Mol.Liq.191(2014)79–84.

[31]G.L.Miller,Use of dinitrosalicylic acid reagent for determination of reducing sugar,Anal.Chem.31(1959)426–428.

[32]C.F.Assis,L.S.Costa,R.F.Melo-Silveira,R.M.Oliveira,H.A.O.Rocha,G.R.Macedo,E.S.Santos,Chitooligosaccharides antagonize the cytotoxic effect of glucosamine,World J.Microbiol.Biotechnol.28(2012)1097–1105.

[33]A.Shabri,R.Samsundin,A hybrid GMDH and Box-Jenkins models in time series forecasting,Appl.Math.Sci.8(2014)3051–3062.

[34]L.Ferreira,P.P.Madeira,L.Mikheeva,V.N.Uversky,B.Zaslavsky,Effect of salt additives on protein partitioning in polyethylene glycol-sodium sulfate aqueous two-phase systems,Biochim.Biophys.Acta 1834(2013)2859–2866.