
Intelligent Design of High Strength and High Conductivity Copper Alloys Using Machine Learning Assisted by Genetic Algorithm

2024-05-25 Parth Khandelwal, Harshit and Indranil Manna

Computers, Materials & Continua, 2024, Issue 4

Parth Khandelwal, Harshit and Indranil Manna,3,⋆

1Metallurgical & Materials Engineering Department, Indian Institute of Technology, Kharagpur, West Bengal, 721302, India

2Computer Science & Engineering Department, Indian Institute of Technology, Kharagpur, West Bengal, 721302, India

3Vice Chancellor Office, Birla Institute of Technology (BIT) Mesra, Ranchi, Jharkhand, 835215, India

ABSTRACT Metallic alloys for a given application are usually designed to achieve the desired properties by devising experiments based on experience, thermodynamic and kinetic principles, and various modeling and simulation exercises. However, the influence of process parameters and material properties is often non-linear and non-colligative. In recent years, machine learning (ML) has emerged as a promising tool to deal with the complex interrelation between composition, properties, and process parameters to facilitate accelerated discovery and development of new alloys and functionalities. In this study, we adopt an ML-based approach, coupled with genetic algorithm (GA) principles, to design novel copper alloys for achieving the seemingly contradictory targets of high strength and high electrical conductivity. Initially, we establish a correlation between the alloy composition (binary to multi-component) and the target properties, namely, electrical conductivity and mechanical strength. CatBoost, an ML model coupled with GA, was used for this task. The accuracy of the model was above 93.5%. Next, to obtain the optimized compositions, the outputs from the initial model were refined by combining the concepts of data augmentation and Pareto front. Finally, the ultimate objective of predicting the target composition that would deliver the desired range of properties was achieved by developing an advanced ML model through data segregation and data augmentation. To examine the reliability of this model, results were rigorously compared and verified using several independent data reported in the literature. This comparison substantiates that the results predicted by our model regarding the variation of conductivity and evolution of microstructure and mechanical properties with composition are in good agreement with the reports published in the literature.

KEYWORDS Machine learning; genetic algorithm; solid-solution; precipitation strengthening; Pareto front; data augmentation

1 Introduction

Applications such as electrical contacts, switches, switchgear, connectors, electrodes, relays, and circuit breakers require a combination of very high electrical conductivity and robust mechanical strength, along with resistance to wear and oxidation [1]. For example, lead frames of integrated circuits require a material with an electrical conductivity of more than 50% IACS (International Annealed Copper Standard) and a tensile strength above 800 MPa [2]. Similarly, contact wires for high-speed trains require a material with an electrical conductivity of more than 60% IACS and a strength above 500 MPa [2,3]. In general, high-conductivity metals find widespread applications in automobiles, manufacturing units, electrical machines, aircraft, military hardware, and energy generation units. Copper offers the second highest electrical conductivity among metals (or the second lowest electrical resistivity, 16.78 nΩ·m at 20°C) and is frequently employed as a conductive medium for wiring and windings in electrical machinery, electronic circuitry, power generation, transmission and distribution systems, telecommunication networks, and a diverse array of electrical apparatus and functionalities [4,5]. However, pure copper is highly ductile and suffers easy wear, oxidation, fatigue, and creep damage. Thus, a series of binary and multicomponent copper alloys have been developed (e.g., Cu-Mg [3], Cu-Cr-Zr [4], Cu-Ni-Si [5,6], Cu-Ni-Si-Cr [7], Cu-Zr [8], Cu-Sn-Zn [9], Cu-Ni-Zn [10], Cu-Be [11], Cu-Cr [12], Cu-Fe-P [13,14], etc.) for various conductor applications. These alloys derive mechanical strength from strain, solid solution, precipitation, or dispersion hardening while offering an optimum combination of electrical conductivity and mechanical strength. Electrical conductivity and mechanical strengthening by any of these mechanisms are typically inversely related. This trade-off arises from the introduction of structural imperfections such as solute atoms, dislocations, and grain/phase boundaries, or the incorporation of precipitates, dispersoids, and new phases. These structural irregularities act both as electron scattering centers, which substantially reduce electrical conductivity, and as stress concentration regions, which contribute to higher mechanical strength [15,16]. Thus, increasing mechanical strength and wear resistance without deteriorating electrical conductivity is a major challenge.

Besides very high electrical and thermal conductivity, copper is widely available, non-hazardous, easy to manufacture, and offers excellent ductility, malleability, corrosion resistance, and scope for tailoring its mechanical properties. Therefore, copper-based metallic alloys appear to be an ideal choice for obtaining the most optimized combination of electrical and mechanical properties. Many studies in the past have pursued this objective and reported interesting results and theories [2,17,18]. Numerous alloying elements have been explored (e.g., Cr, Zr, Ni, Si, Mg, Zn, Sn, Ti, Al, Co) through various processing routes [13,19–23], and both adverse and favorable effects on electrical conductivity and mechanical strength have been reported [19,20,24–28]. Strengthening has been attempted through strategies of stacking fault energy modulation [29], precipitation hardening [17,26,30], grain refinement [12], nano-twin formation [8], etc. Most of these studies were experimental in nature and hence were time-consuming and expensive [7,13,14,23,31–36].

In general, designing alloys and predicting their properties is of paramount interest in materials science and engineering for accelerating materials discovery at an affordable time and cost [15,37]. Predicting the structural properties of a given alloy through the experimental route requires an understanding of the complex physical interaction of atoms/ions/molecules subjected to mechanical activation, dislocation dynamics through strain fields, and intricate precipitate-matrix interface evolution in the course of thermal/thermo-mechanical processing. Machine learning is recognized as a robust technique to correlate complex non-linear interrelations feasibly and effectively [30,38–41]. Therefore, it has found its way into accelerated materials design and development, as it can bypass the complex intermediate pathway and directly learn the non-linear correlation between composition and properties [30,42–44]. ML has successfully showcased its strength in numerous research areas such as the development of low-cost, low-modulus bone-like material [38], optimization of the processing route for advanced inorganic materials (MoS2) [45], property enhancement for carbon quantum dots [38], and the discovery of new materials (metallic glasses [45], high entropy alloys [46–50], shape memory alloys [51]). In the recent past, attempts have been made to synthesize high-strength conductors using ML [50,52]. Wang et al. [10] laid the stepping stone by effectively designing a complex high-performance copper alloy using a property-oriented ML design system. Pan et al. [26] synthesized a high-performance Cu-Ni-Co-Si alloy by optimizing the composition and explained the relation between the process parameters and microstructure, including phase evolution. Zhang et al. [18] designed a high-performance Cu-1.3Ni-1.4Co-0.56Si-0.03Mg alloy by combining ML with a key screening factor and Bayesian optimization. Ozerdem et al. [9] employed a multi-layer artificial neural network to predict the mechanical properties (yield strength, tensile strength, and elongation) of the Cu-Sn-Pb-Zn-Ni system.

It is known that ML approaches rely on several hyperparameters, which require tuning to ensure the accuracy of prediction. In this context, genetic algorithm-based optimization is considered to be a powerful technique for approaching the global optimum. GA adopts population-based search, unlike conventional gradient-based optimization techniques, which are prone to getting trapped in local optima. GA is a robust metaheuristic optimization technique adept at optimizing multi-dimensional features [39] and well-suited for tackling intricate nonlinear and nonconvex problems [47]. GA offers several advantages compared to conventional optimization algorithms, notably its adaptability to various optimization scenarios and potential for parallelization [53]. In this study, a real-coded GA routine has been utilized. It employs continuous numerical representations to efficiently explore solution spaces, enabling optimization for complex real-world problems with diverse variables. Through the use of real number encodings, this algorithm presents a versatile optimization approach that proves highly effective for tasks such as fine-tuning parameters in ML and addressing complexities in engineering design. However, certain considerations warrant attention, including the careful formulation of the fitness function, determination of the population size, selection of critical parameters such as mutation and crossover rates, and the selection criteria for the new population [53]. Its applicability spans diverse domains, including graph coloring, pattern recognition, travelling salesman problems, and efficient design of airfoils [53]. Various researchers have utilized ML-assisted GA in applications such as the accelerated discovery of nanostructured alloys [39], design of new molecules with valid chemical structure [54], prediction of atmospheric corrosion depth of steel and zinc [54], design of new medium carbon steel by predicting mechanical properties from its composition and heat treatment [55], and prediction of elevated temperature constitutive flow behavior of 42CrMo steel [45].

The present study aims to utilize ML as a guiding tool to design novel copper alloys with optimized composition and a desired set of properties for specific applications. Here we have built robust ML models coupled with GA-based optimization to predict the mechanical (hardness and ultimate tensile strength) and electrical (electrical conductivity) properties from composition while overcoming the trade-off between mechanical and electrical properties. Next, to realize the challenging task of property-to-composition prediction, it is first important to develop a model that can directly map and correlate the chosen composition to the corresponding properties with high precision. Here the challenge arises from the increased dimensionality of the output (composition) hyperspace while attempting to converge on and predict the optimum alloy composition with limited data. This problem has been resolved in this work by adopting a novel composition-design technique and an appropriate combination of data segregation and data augmentation by ML. Eventually, this model, which was built to meet the previous objectives, further enabled us to predict the stoichiometry of the precipitate and gather insights on the prevailing degree of precipitation and solid solution in the system. To validate the efficacy of the model, results were rigorously verified against the relevant experimental results reported in the literature. Approximately 40 new sets of data points generated by the present model were validated through careful comparison and found to be in close agreement with the experimental values, with an error limit of less than 7%.

2 Database

The database was curated manually by aggregating data from over 150 distinct scientific publications. This effort resulted in a dataset comprising more than 500 individual data points, as shown in Fig. 1. It is known that both electrical and mechanical properties of an alloy depend on heat treatment parameters and vary in opposite directions with time and temperature. Therefore, optimal combinations of electrical conductivity and mechanical properties (in terms of ultimate tensile strength, UTS, or hardness on the Vickers scale, HV) of the alloys were extracted from the kinetic (time-temperature-extent of transformation) plots for the concerned heat treatment. Consequently, when a machine learning model is built on this data, it should predict the optimized properties for a new alloy system. The database used in this study contains 13 alloying elements, 2 dispersoids, and porosity as input fields, and electrical conductivity and ultimate tensile strength as output fields. Both the maximum and minimum values of every field are provided in Table 1. Data has been collected for both powder metallurgy and casting routes of alloy synthesis; hence a porosity field was added to the database so that the ML model can distinguish the data by processing route. For the HV-UTS model, a new database was generated from the previous database by selecting only those compositions for which both HV and UTS values were available. In this way, a total of 117 data points were obtained to generate the HV-UTS database.
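The construction of the HV-UTS subset can be illustrated with a minimal sketch. The record layout, field names (`ec_iacs`, `uts_mpa`, `hv`), and all numerical values below are hypothetical placeholders chosen for illustration; they are not taken from the actual database.

```python
# Hypothetical records: composition in wt.% (Cu is the balance), porosity,
# and measured properties. None marks a property that a source did not report.
records = [
    {"composition": {"Cr": 0.5, "Zr": 0.15}, "porosity": 0.0,
     "ec_iacs": 87.0, "uts_mpa": 530.0, "hv": None},
    {"composition": {"Ni": 2.0, "Si": 0.5}, "porosity": 0.0,
     "ec_iacs": 45.0, "uts_mpa": 700.0, "hv": 220.0},
    {"composition": {"Mg": 0.2}, "porosity": 0.0,
     "ec_iacs": 80.0, "uts_mpa": None, "hv": 110.0},
]


def hv_uts_subset(rows):
    """Keep only the rows for which both HV and UTS are available."""
    return [r for r in rows if r["hv"] is not None and r["uts_mpa"] is not None]


subset = hv_uts_subset(records)
```

Only the second placeholder record survives the filter, mirroring how the 117-point HV-UTS database is a strict subset of the full one.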

Figure 1: Database of electrical conductivity and ultimate tensile strength defining the property space of the randomly selected binary and multi-component Cu-base alloys used in the present optimization study

3 Machine Learning Design and Development

ML is often regarded as a universal function approximator because of its ability to fit and predict non-linear data. It has evolved into a very powerful guiding tool for accelerated materials discovery. Fig. 2a shows the versatility and robustness of this pathway over other conventional and non-conventional routes. It acts like an over-bridge that directly connects the input fields to the desired output properties, bypassing the difficulties of extensive experiments, the complexities of intricate simulations, and the need for rigorous parametric and empirical studies. In addition, its inherent time efficiency, cost-effectiveness, and high accuracy overcome the limitations of conventional materials design approaches and thereby serve as a proof-of-concept for developing new high-performance alloys. Fig. 2b highlights the advantages and disadvantages of using ML in contrast to the other approaches, providing a ranking based on various descriptors. In this study, the focus is on designing new high-conductivity and high-strength copper alloy-based electrical conductors by ML and GA.

Different ML models capture different patterns in data with different degrees of accuracy. Hence, a series of models, namely support vector machine (SVM) [56], decision trees [57], random forest [58], XGBoost [59,60], CatBoost [61], artificial neural network (ANN) [62], LightGBM [63], and gradient boosting regressor [64], were trained on the dataset. The corresponding accuracies are detailed in Table S1. Among these, the CatBoost model [51] outperforms all its counterparts, while the XGBoost [59,60], ANN, and random forest [48] approaches also performed well and provided acceptable accuracies. To pursue the primary objective of forecasting composition-to-property relationships, "Model 1" was conceived. This framework integrates the CatBoost ML model with a genetic algorithm (GA). The GA enhances CatBoost-based predictions by iteratively refining hyperparameters (detailed in Table 2). The CatBoost algorithm is rooted in gradient boosting on decision trees. Among its various hyperparameters, depth, learning rate, iterations, l2_leaf_reg, border_count, and thread_count proved to be the most effective ones to optimize in this study, considering both time and computational constraints. GA is a well-established evolutionary technique that has been employed in the present optimization exercise. GA comprises distinct stages of initial population, fitness evaluation, selection, crossover, and mutation. Our GA exercise employs an initial population of 60 and terminates after 125 iterations, with the crossover probability set at 0.95, the mutation probability at 0.1, and the mean absolute error (MAE) as the error function. These choices drive iterative refinement, yielding optimized hyperparameters and minimizing prediction errors. Since experiments consume a great deal of research time and cost, a stacking model (Model 2) was developed before proceeding to experimentation to cross-validate the results of Model 1, as stacking models are generally considered to be reasonably stable over the domain. Model 2 was developed by stacking together the ANN, XGBoost, CatBoost, and random forest models and combining their outputs through a support vector machine meta-model. To address the objective of inverse design, i.e., property-to-composition prediction, data augmentation was first conducted with the help of Model 1, and then suitable models were developed on top of it for different alloying systems.
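The GA stages listed above (initial population, fitness evaluation, selection, crossover, mutation) can be sketched as a small real-coded routine. This is a minimal illustration under stated assumptions, not the routine used in the study: tournament selection, arithmetic crossover, Gaussian mutation, and elitism are illustrative choices, and `toy_validation_mae` is a hypothetical stand-in for the validation MAE that a trained CatBoost model would return. Only the population size (60), iteration count (125), crossover probability (0.95), and mutation probability (0.1) come from the text.

```python
import random


def real_coded_ga(fitness, bounds, pop_size=60, generations=125,
                  p_crossover=0.95, p_mutation=0.1, seed=0):
    """Minimize `fitness` over real-valued vectors constrained to `bounds`."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]

        def tournament():
            # Binary tournament: the better of two random individuals wins.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] <= scores[b] else population[b]

        children = [best[:]]  # elitism: carry the best individual forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_crossover:
                # Arithmetic crossover: random convex combination of parents.
                w = rng.random()
                child = [w * x + (1.0 - w) * y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mutation:
                    # Gaussian mutation, clipped back into the allowed range.
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = max(lo, min(hi, child[i] + step))
            children.append(child)
        population = children
        best = min(population, key=fitness)
    return best, fitness(best)


def toy_validation_mae(params):
    """Hypothetical stand-in for a model's validation MAE (not real data)."""
    depth, learning_rate = params
    return (depth - 6.0) ** 2 + 100.0 * (learning_rate - 0.1) ** 2


# Assumed search ranges for two hyperparameters: depth and learning rate.
BOUNDS = [(2.0, 10.0), (0.01, 0.3)]
best_params, best_score = real_coded_ga(toy_validation_mae, BOUNDS)
```

In an actual tuning run, the fitness function would train the model with the candidate hyperparameters (rounding integer-valued ones such as depth) and return the MAE on the validation set.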

Table 2: Accuracies for 'composition-to-property' models

Figure 2: (a) Machine learning as a high-throughput screening instrument for materials discovery, circumventing limitations inherent in conventional methods such as experimental trials, computational simulations, and parametric analyses, and (b) advantages and disadvantages of the various approaches, ranked quantitatively. The ranking shows the relative position of ML within the spectrum of methodologies in terms of efficiency and performance criteria

Fig. 3 illustrates the workflow adopted to realize the stated objectives. Initially, data cleaning and feature selection were done recursively to enhance the model accuracy. Then, two robust ML models were developed: Model 1 and Model 2. Model 1 was designed for composition-to-property prediction, while Model 2 was developed for double-checking the Model 1 predictions, i.e., increasing the prediction reliability and screening efficiency by applying both models in parallel. At this stage, grid optimization was employed to obtain the best performance of Model 2 through hyperparameter tuning, following which cyclical feedback was sent back to the feature engineering and data selection step to further refine the data and improve overall accuracy and efficiency. After optimizing this feedback loop, GA was employed on the validation set for tuning the ML model hyperparameters. Here, GA is used to locate the optimum precisely, which is not feasible with grid optimization given the large number of CatBoost hyperparameters. To tackle the subsequent objectives of property-to-composition prediction and obtaining the optimized copper alloy compositions, data augmentation becomes crucial, as the ML models require a training set and the data from the database may not be sufficient to train them reliably. Therefore, data was first augmented by varying the alloying composition in small increments for every alloy system separately. This procedure culminated in the creation of over 4 million compositions collectively, with Model 1 efficiently assessing the respective properties in a high-throughput fashion. To achieve the second objective of exploring the potency of new copper alloys, all data points were plotted as scatterplots to determine the Pareto front representing the optimized compositions; this is discussed in Section 4.3.
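The augmentation step, in which the alloying composition is varied in small increments for one alloy system at a time, can be sketched as a simple grid enumeration. The element ranges and step size below are hypothetical and far coarser than those that would produce millions of candidates; the function name `composition_grid` is likewise illustrative.

```python
from itertools import product


def composition_grid(ranges, step):
    """Yield every composition (wt.%) on a regular grid for one alloy system.

    ranges: {element: (min_wt, max_wt)}; Cu is added as the balance.
    """
    elements = list(ranges)
    axes = []
    for el in elements:
        lo, hi = ranges[el]
        n_steps = int(round((hi - lo) / step))
        axes.append([round(lo + i * step, 6) for i in range(n_steps + 1)])
    for combo in product(*axes):
        comp = dict(zip(elements, combo))
        comp["Cu"] = round(100.0 - sum(combo), 6)
        yield comp


# Hypothetical Cu-Cr-Zr ranges, coarse step chosen for readability.
grid = list(composition_grid({"Cr": (0.0, 1.0), "Zr": (0.0, 0.5)}, step=0.25))
```

Each generated composition would then be passed to the forward model (Model 1 in the text) to obtain its predicted EC and UTS.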

For the third objective, we need to find the optimized composition of conventional and non-conventional alloys. First, using a defining criterion (DC) (explained in Section 4.3.2), the data is divided into two groups. To obtain the optimized composition, we use a procedure similar to the one used for the second objective. The next objective is the reverse design, i.e., the property-to-composition model. A given property requirement can be met through various alloying compositions. Thus, the alloy system was fixed first; then the data was augmented for that alloy system, and finally a suitable ML model was deployed on top of it for property-to-composition prediction. For the last objective of precipitate stoichiometry prediction, we use the fact that electrical conductivity is extremely sensitive to the extent of solid solution and precipitation in the system. Therefore, precipitate stoichiometry is obtained by plotting the electrical conductivity against the atomic ratio of alloying elements.
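Plotting conductivity against the atomic ratio of two alloying elements first requires converting weight percent to an atomic ratio. The sketch below shows that conversion using standard atomic masses. Reading the stoichiometry from the ratio at which conductivity peaks is an assumption made here for illustration, and the (ratio, EC) pairs are invented placeholder values rather than model output.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"Ni": 58.693, "Si": 28.086}


def atomic_ratio(wt_percent, a, b):
    """Atomic ratio a:b computed from weight percentages."""
    moles_a = wt_percent[a] / ATOMIC_MASS[a]
    moles_b = wt_percent[b] / ATOMIC_MASS[b]
    return moles_a / moles_b


# 4.18 wt.% Ni with 1 wt.% Si corresponds to roughly two Ni atoms per Si atom.
ratio = atomic_ratio({"Ni": 4.18, "Si": 1.0}, "Ni", "Si")

# Hypothetical (Ni:Si atomic ratio, predicted EC in %IACS) pairs.
ec_vs_ratio = [(1.0, 38.0), (1.5, 42.0), (2.0, 47.0), (2.5, 44.0), (3.0, 40.0)]

# Assumption: conductivity is highest where the solutes are most completely
# removed from solid solution, i.e. near the precipitate stoichiometry.
peak_ratio, peak_ec = max(ec_vs_ratio, key=lambda pair: pair[1])
```

With these placeholder values the peak falls at a ratio of 2, which would point to a precipitate with two Ni atoms per Si atom.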

Figure 3: Flow chart showing the logical sequence for model development and materials selection

4 Results and Discussion

4.1 UTS-Hardness Model(UHM)

HV, UTS, and EC, apart from wear resistance, are the three most crucial properties of copper alloys for electrical contact applications. Fig. 4 shows the variation of HV as a function of UTS for several Cu-alloys. Obtained through a standard regression analysis, Fig. 4 suggests that a linear relationship exists between these two important mechanical properties for the entire series of Cu-alloys covered in this analysis. To ascertain the overall accuracy of the model, enhance the reliability of predictions, and mitigate potential biases, the training and testing steps were iterated 10 times. The dataset was divided into 10 subsets; in each iteration, one distinct subset was held out for testing while the remaining nine were used for training. This strategy was adopted because the available dataset comprised just 117 data points for the UHM. Allocating a larger proportion (90%) for training and a smaller fraction (10%) for testing, and then averaging over 10 iterations, ensured stable and reliable estimates of the predictive accuracy of the model. Finally, the mean of the ten accuracy values was calculated to obtain the overall accuracy, or mean absolute error, of the model. The correlation between UTS and HV is determined by the model with high accuracy (R² value: 0.98, MAE value: 9.38 HV) to arrive at the following equation:
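The ten-fold procedure described above can be sketched as follows for a straight-line fit of HV against UTS. The data are synthetic: the slope, intercept-free form, and noise level are arbitrary placeholders and do not represent the fitted equation of the study. Only the fold count (10) and the dataset size (117) are taken from the text.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_mae(xs, ys, k=10, seed=0):
    """Mean absolute error averaged over k train/test splits."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint test subsets
    fold_maes = []
    for test in folds:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        errors = [abs(ys[i] - (slope * xs[i] + intercept)) for i in test]
        fold_maes.append(sum(errors) / len(errors))
    return sum(fold_maes) / k


# Synthetic stand-in for the 117-point HV-UTS database.
rng = random.Random(1)
uts = [rng.uniform(200.0, 1000.0) for _ in range(117)]
hv = [0.3 * u + rng.gauss(0.0, 10.0) for u in uts]  # arbitrary slope and noise
cv_mae = kfold_mae(uts, hv)
```

Because every point is used for testing exactly once, the averaged MAE makes fuller use of a small dataset than a single 90/10 split would.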

Figure 4: Variation of Vickers hardness as a function of ultimate tensile strength for copper-based alloys. Here, individual data points represent the relationship between these two independent mechanical properties for the selected alloy compositions, and the red line marks the best linear regression fit for all the data points

4.2 Composition-to-Property Prediction

As already stated, Model 1 follows a GA-optimized CatBoost-based ML protocol to correlate the Cu-alloy composition (input) with the mechanical (UTS) and electrical (EC) properties (outputs) of the same alloys. The real-coded GA exercise improved the accuracy of Model 1 over that of the default CatBoost model, reducing the strength and conductivity MAE from 43.01 MPa and 3.86% IACS to 35.96 MPa and 3.78% IACS, respectively. Model 2 is a stacking ensemble model that gives an accuracy similar to that of Model 1, as shown in Table 2. The performance and precision of this model are due to the simultaneous contribution of multiple base models, namely ANN, random forest, CatBoost, and XGBoost. Initially, these four base models were trained, and their predictions on the test data constituted the metadata. Subsequently, the metadata was passed to a support vector machine (SVM) meta-model to perform the final prediction of the concerned properties.

Figs. 5 and 6 illustrate the efficacy of the present models in capturing the composition-property correlation in copper alloys. The broken blue line is the identity line (y = x), which signifies the ideal prediction, i.e., the predicted value exactly matches the experimental result. The deviation from this line indicates the extent of error associated with an individual predicted data point. Points lying above and below the ideal line (y = x) imply over-prediction and under-prediction by the model, respectively. The continuous, thicker lines (red and yellow) represent the best-fit lines for the predicted sets of data points. The difference in slope between the broken and continuous lines in Figs. 5 and 6 shows how far the average predicted trend deviates from the ideal one and hence signifies the degree of error.
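The quantities read off such a parity plot (MAE, R², and the slope of the best-fit line relative to the identity line) can be computed with a short helper. The numbers in the example are hypothetical conductivity values in %IACS, included only to show the calculation.

```python
def parity_metrics(y_true, y_pred):
    """Return (MAE, R^2, slope of the predicted-vs-experimental best fit)."""
    n = len(y_true)
    mae = sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # Least-squares slope of predictions regressed on experimental values;
    # a slope of 1 coincides with the ideal y = x line.
    slope = sum((t - mean_t) * (p - mean_p)
                for t, p in zip(y_true, y_pred)) / ss_tot
    return mae, r2, slope


# Hypothetical experimental and predicted EC values (%IACS).
experimental = [40.0, 60.0, 80.0, 90.0]
predicted = [42.0, 58.0, 79.0, 91.0]
mae, r2, slope = parity_metrics(experimental, predicted)
```

A slope below 1 indicates that the model compresses the range, over-predicting low values and under-predicting high ones.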

Figure 6: Training and refining exercise for establishing the relationship between the predicted and experimental data concerning (a) electrical conductivity (EC) and (b) ultimate tensile strength (UTS), and the accuracy of the Model 2 test results in terms of predicted values as a function of experimental data concerning (c) EC and (d) UTS of the selected set of Cu-alloys, respectively. The broken blue line signifies the ideal predictions, the red line represents the best-fit line for the predictions made on the training set, and the yellow line indicates the best-fit line for the predictions made on the testing set by Model 2. The violin plots (in the inset) show the probability distribution of the difference between predicted and experimental values

The inset in each of the plots in Figs. 5 and 6 presents a violin plot, in which the central gray region is the box plot and the outer colored region indicates the population density. The central white spot is the median, the thicker gray region is the interquartile range, and the thin gray line represents the rest of the distribution. On each side of the gray line is a kernel density estimate showing the distribution of the data. Wider sections of the violin plot represent a higher probability, i.e., more members of the population take that value. Conversely, the narrower sections represent a lower probability. The violin plots in Figs. 5 and 6 are built on the difference between predicted and experimental data and show that the density is highest near zero and decreases on moving away from this central spot. This decay with distance substantiates that the models are effective in predicting the properties and in avoiding or minimizing noise.

It is interesting to note that Model 1 is more accurate in the higher strength-vs.-conductivity domain, as shown in Figs. 5c and 5d, which is our area of interest for designing high-strength Cu-based conductors and contacts. It is also evident from Figs. 5d and 6d that the predicted UTS results from Model 1 are more accurate than those obtained from Model 2, because the test line is nearer to the ideal line and the violin plot is more concentrated near zero. Similarly, Figs. 5c and 6c indicate that Model 2 is slightly better at predicting EC values, as quantified in Table 2. The predictions of Model 1 have been verified by comparison with relevant data from several published reports that were not used by the model during training; the comparison is presented in Table 3 and Table S2. The results are in close agreement with the available data, with errors of less than 7% and 6% for UTS and EC, respectively.

Table 3: Comparison of predicted properties of copper alloys with experimental results

4.3 Optimal Composition Design

4.3.1 Analyzing the Potency of New Systems

There are many Cu-alloys which have never been explored experimentally. To pursue such possibilities, a database has been developed by gathering a substantial amount of useful data on Cu-alloys from numerous research papers, books, patents, etc. After this exercise, any Cu-alloy which is missing from the dataset is categorized as an unexplored alloy. To explore the potency of these new alloys/systems, a Pareto front has been generated with respect to their electrical and mechanical properties. Various new alloys are identified and plotted in the property space as shown in Fig. 7, where the top right corner, called the Pareto front, is the region of interest as it represents the desired combination of high mechanical strength and high electrical conductivity. No entry on this Pareto front is dominated by any other. Here, instead of using an objective optimization function, the best-optimized compositions were manually extracted from the Pareto front by leveraging knowledge of the target applications. By observing the Pareto front, it is found that Zr and Cr are two excellent candidate elements that provide an excellent combination of electrical conductivity and mechanical strength. Ni and Si increase the strength significantly without much deterioration of electrical conductivity. P is used as a refining agent that removes impurities from the system, which recovers some of the electrical conductivity. If elements like Mg, Ni, Cr, Si, and P are present together, precipitation may result due to limited solubility, leading to a better electrical-mechanical property trade-off. Sn and Zn lower the stacking fault energy of copper alloys [10], resulting in enhanced mechanical strength but lower electrical conductivity. Again, by analyzing the Pareto fronts of various alloys, Cu-Ni-P is found to be a possible system combining both good strength and good conductivity. Similarly, Fig. 7 indicates that Cu-Cr-Zr-ZrB2, Cu-Mg-Si-P, and Cu-Zr-Fe are alloys that may offer good strength and very high conductivity; Cu-Zr-Ni and Cu-Cr-Zr-Mg-Ni are likely to be of high strength and good conductivity; Cu-Zr-Zn-Sn and Cu-Zr-Zn may combine good strength and high conductivity; Cu-Cr-Zr-Zn-Sn is expected to manifest high strength and high conductivity; Cu-Cr-Zr-Mg-Ni-Si may offer very high strength and good conductivity; and Cu-Cr-Zr-Ni-Si-P and Cu-Ni-Si-Fe-P are potential alloys with very high strength and high conductivity, provided these alloys have the compositions listed in Table 4. Finally, by leveraging the understanding of application requirements, the best compositions of these new alloys are manually chosen from the non-dominated dataset created for the Pareto front and summarized in Table 5.
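Extracting the non-dominated set from a cloud of predicted (UTS, EC) points, with both properties to be maximized, can be sketched as below. The candidate labels and property values are hypothetical and serve only to illustrate the dominance test.

```python
def pareto_front(points):
    """Return the non-dominated points when both UTS and EC are maximized.

    points: iterable of (uts_mpa, ec_iacs, label) tuples.
    A point is dominated if another point is at least as good in both
    properties and strictly better in at least one.
    """
    # Sort by UTS descending (ties broken by EC descending); a point is then
    # non-dominated only if its EC exceeds every EC seen so far.
    ordered = sorted(points, key=lambda p: (-p[0], -p[1]))
    front, best_ec = [], float("-inf")
    for point in ordered:
        if point[1] > best_ec:
            front.append(point)
            best_ec = point[1]
    return front


# Hypothetical candidates: (UTS in MPa, EC in %IACS, label).
candidates = [
    (600.0, 80.0, "A"),
    (550.0, 85.0, "B"),
    (500.0, 82.0, "C"),  # dominated by B (lower UTS and lower EC)
    (700.0, 60.0, "D"),
    (650.0, 55.0, "E"),  # dominated by D
]
front = pareto_front(candidates)
front_labels = [label for _, _, label in front]
```

The sort-and-sweep form runs in O(n log n), which matters when the input is millions of augmented compositions rather than five points.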

Table 4: Ranges of alloying elements used for data augmentation and exploration of the EC-UTS domain of copper alloys

Table 5: New copper alloy systems with their optimized composition candidates

Figure 7: (Continued)

In Table 6, the alloys with composition Cu-0.5Cr-0.15Zr, Cu-0.45Cr-0.07Zr, Cu-0.45Cr-0.15Zr-0.06Mg, Cu-0.57Cr-0.1Zr-0.05Mg, Cu-2Fe-0.046P, Cu-0.2Mg-4.08Ni-0.7Si, Cu-2.17Ni-0.4Si-1.08Zn, and Cu-0.1Zn-0.125Fe-0.045P are close to the actual compositions of alloys that have been prepared and subjected to experimental measurement of the properties relevant to this study, namely, Cu-0.5Cr-0.15Zr [76], Cu-0.45Cr-0.068Zr [77], Cu-0.43Cr-0.15Zr-0.06Mg [78], Cu-0.6Cr-0.1Zr-0.03Mg [65], Cu-2Fe-0.05P [79], Cu-0.1Mg-4Ni-1Si [73], Cu-1.8Ni-0.4Si-1.1Zn [71], and Cu-0.1Zn-0.15Fe-0.05P [70]. It is interesting to note that the electrical conductivity and mechanical strength of the latter set of experimentally prepared alloys match the properties predicted by the present model with less than 2% error. In Table 7, Cu-0.46Cr-0.088Sn (EC: 82.5% IACS and UTS: 594.9 MPa) can be mapped by individually examining the effects of Cr and Sn through Cu-0.45Cr (EC: 86% IACS and UTS: 580 MPa) [80] and Cu-0.1Cr-0.1Sn (EC: 90% IACS and UTS: 561 MPa) [68], with less than 10% expected error in property. The first alloy from Table 5, i.e., Cu-0.48Cr-0.1Zr-0.01Mg-2.17Ni, is very close to one reported in the literature with a composition of Cu-0.5Cr-0.15Zr (EC: 87% IACS; UTS: 530 MPa) [76]. Adding 2.17% Ni to the latter alloy is believed to improve the strength but reduce the conductivity, eventually bringing the concerned electrical and mechanical properties closer to their predicted values.

Table 6: Conventional copper alloy systems with their optimized composition candidates

Table 7: Non-conventional copper alloy systems with their optimized composition candidates

4.3.2 Optimized Composition of Conventional and Non-Conventional Alloys

To differentiate between conventional (more likely to succeed) and non-conventional (not yet explored or proven) alloys, we define alloy systems that account for less than 1% of the data points in the entire dataspace as non-conventional, and those that account for 1% or more of the data points as conventional. The true potential of these systems is ascertained by optimizing their composition. The proportion of each element in each alloy was varied in fine steps, thereby generating millions of compositions for every Cu-alloy system, and their properties were obtained through Model 1.
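The two steps described above, labelling systems by their share of the dataset and enumerating compositions on a fine grid, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy dataset labels, and the element ranges and step size are all assumptions chosen for the example (they are not the ranges of Table 4).

```python
from collections import Counter
from itertools import product

def classify_systems(records, threshold=0.01):
    """Label each alloy system by its share of the dataset.

    records: one alloy-system label per data point.
    A share of at least `threshold` (1% here) counts as conventional.
    """
    counts = Counter(records)
    total = len(records)
    return {
        system: "conventional" if n / total >= threshold else "non-conventional"
        for system, n in counts.items()
    }

def composition_grid(ranges, step):
    """Enumerate compositions (wt.%) on a regular grid.

    ranges: {element: (low, high)}; every combination of grid values is returned.
    """
    axes = []
    for element, (low, high) in ranges.items():
        n_steps = int(round((high - low) / step))
        axes.append([(element, round(low + i * step, 6)) for i in range(n_steps + 1)])
    return [dict(combo) for combo in product(*axes)]

# Hypothetical dataset: 199 points of one system and a single point of another
records = ["Cu-Cr-Zr"] * 199 + ["Cu-Cr-Sn"]
labels = classify_systems(records)

# Hypothetical ranges and a coarse step, to keep the grid small
grid = composition_grid({"Cr": (0.4, 0.6), "Zr": (0.05, 0.15)}, step=0.05)
print(labels, len(grid))
```

With a finer step and more elements the grid grows multiplicatively, which is how millions of candidate compositions per system arise.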

All these compositions were then scanned, and every non-dominated composition was plotted to form the Pareto front in the property space of UTS and EC (Fig. 8). The best compositions identified in this way are summarized in Tables 6 and 7 for the so-called conventional and non-conventional alloys, respectively. Many optimized compositions listed in these tables were also verified by rigorous comparison with data reported in the literature for alloys of precisely the same or nearby composition. The Pareto front of these alloys has proved to be a powerful tool for obtaining optimized compositions and choosing a potential Cu-alloy with the desired set of properties for a particular application.

Figure 8: Pareto front (optimum variation of electrical conductivity as a function of ultimate tensile strength) for (a) and (b) various conventional or standard Cu-alloys, and (c) and (d) the same for various non-conventional or hypothetical Cu-alloys considered in this study
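As an illustration of how non-dominated compositions can be extracted, the sketch below filters a list of (UTS, EC) pairs when both properties are to be maximized. It is a generic sort-and-sweep routine under that assumption, not the authors' implementation, and the candidate values are invented for the example.

```python
def pareto_front(points):
    """Return the non-dominated (UTS, EC) points, both to be maximized.

    A point is dominated if another point is at least as good in both
    properties and strictly better in at least one.
    """
    # Sort by UTS descending, breaking ties by EC descending
    ordered = sorted(points, key=lambda p: (-p[0], -p[1]))
    front = []
    best_ec = float("-inf")
    for uts, ec in ordered:
        # Every earlier point has UTS >= this one, so the point survives
        # only if its EC beats all of them
        if ec > best_ec:
            front.append((uts, ec))
            best_ec = ec
    return front

# Hypothetical (UTS in MPa, EC in %IACS) predictions for one alloy system
candidates = [(600, 80), (550, 85), (500, 82), (650, 70), (650, 60), (450, 90)]
front = pareto_front(candidates)
print(front)
```

Sorting first keeps the cost at O(n log n), which matters when millions of candidate compositions are scanned per system.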

4.3.3 Property-to-Composition Prediction

To explore other approaches to composition design or alloy selection, a separate model has been developed to select the target alloy composition for a desired set of properties. To serve this objective, the model needs to map a low-dimensional input onto a high-dimensional output, which is very difficult to learn from a limited number of available data points and may, in turn, lead to inaccurate results or selections. This problem has largely been overcome by using data augmentation to generate supplementary data. Since a given broad set of electrical and mechanical properties can be achieved via multiple Cu-alloy systems (i.e., there is no unique system), the supplementary data were generated separately for each candidate alloy system using Model 1. Without data augmentation, the accuracy (R² value) remained consistently below 0.5 for all chosen alloy systems. However, upon integrating data augmentation, the R² value exceeded 0.8 for each constituent within every alloy system, as presented in Table S3. These newly generated datasets were then coupled with a suitable ML model for property-to-composition prediction. Several of the alloy compositions predicted using these models, presented in Table 8, are in close agreement with the compositions of experimentally prepared alloys and their relevant properties.
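The idea of augmenting one alloy system at a time and then inverting the mapping can be illustrated with a deliberately simple sketch. Everything here is an assumption made for illustration: `toy_forward_model` is an invented linear stand-in for a composition-to-property model (it is not Model 1, and its coefficients carry no physical meaning), and a nearest-neighbour lookup replaces the ML model used in the study.

```python
import math

def toy_forward_model(ni, si):
    """Hypothetical stand-in for a composition-to-property model.

    Returns (UTS in MPa, EC in %IACS); the coefficients are invented.
    """
    uts = 250.0 + 120.0 * ni + 200.0 * si
    ec = 100.0 - 8.0 * ni - 15.0 * si
    return uts, ec

def augment(ni_values, si_values):
    """Build an augmented table of (composition, properties) pairs for one system."""
    return [((ni, si), toy_forward_model(ni, si)) for ni in ni_values for si in si_values]

def nearest_composition(table, target_uts, target_ec, uts_scale=100.0, ec_scale=10.0):
    """Return the composition whose properties lie closest to the target.

    Properties are scaled so that UTS and EC contribute comparably.
    """
    def distance(entry):
        uts, ec = entry[1]
        return math.hypot((uts - target_uts) / uts_scale, (ec - target_ec) / ec_scale)
    return min(table, key=distance)[0]

ni_grid = [round(0.5 * i, 1) for i in range(1, 11)]   # 0.5 ... 5.0 wt.% (assumed)
si_grid = [round(0.1 * i, 1) for i in range(1, 16)]   # 0.1 ... 1.5 wt.% (assumed)
table = augment(ni_grid, si_grid)

target = toy_forward_model(3.0, 0.7)                  # properties of a known grid point
recovered = nearest_composition(table, *target)
print(recovered)
```

Restricting the table to a single alloy system is what makes the inverse well behaved: within one system the toy mapping is one-to-one, whereas pooling several systems would let different compositions share the same property pair.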

Table 8: Property-to-composition mapping for copper alloys using suitable property-to-composition model

4.3.4 Precipitate Prediction and Solid Solution Analysis

As already stated, the scattering of electrons by solute atoms leads to an increase in resistivity, or a decrease in electrical conductivity, of the solvent matrix or lattice. Obviously, the electrical conductivity of a solid will usually be inversely related to the amount of dissolved alloying elements or solute atoms. Moreover, the local conductivity may manifest a strong dependence on the position of the respective solute atoms in the lattice. This is because no two elements with different atomic numbers carry an identical number of electrons and hence, they differ in atomic diameter. That is why each solute atom creates an expanding (compressive) or collapsing (tensile) strain field in its nearest vicinity due to the atomic size mismatch with respect to the solvent or matrix atom. Moreover, the local electron density around the solute may change in the case of non-metallic solids with a finite band gap between conduction and valence bands. Thus, the strain field around a solute atom leads to greater electron scattering than what is experienced either in the pure solvent lattice or when the solute atoms escape from the matrix in the form of a precipitate phase, coherent or incoherent with the matrix. Therefore, an increase in solute content in the matrix should decrease electrical conductivity. Subsequent precipitation on exceeding the solubility limit should increase the electrical conductivity of the solid compared to an alloy holding the entire amount of solute dissolved in the matrix.

ML does not directly account for the governing mechanism, physical principles, heat treatment history or processing stages. Nonetheless, ML is a very powerful technique for learning highly complex non-linear interrelations. Therefore, domain knowledge must be combined with the model outcomes to extract and interpret the relevant results. Figs. 9a–9c confirm this theory at various levels of alloying with the principal solute elements for the selected set of Cu-alloys, which derive their mechanical strengthening primarily from the precipitation of intermetallic phases based on the atomic ratio of (a) Ni:Si, (b) Cr:Si, and (c) Ni:Sn, respectively. The lines representing the three levels of principal alloying addition (2, 4 and 5 at.%) lie systematically one above the other in Fig. 9, demonstrating that electrical conductivity is inversely related to the amount of solute (2, 4 and 5 at.%) added and dissolved in the alloy. Furthermore, the nearly identical variation of slope of these lines as a function of the volume fraction of the concerned precipitates (Ni-Si, Cr-Si and Ni-Sn) in each alloy system at all levels of initial solute addition (2, 4 and 5 at.%) further substantiates the theory that electrical conductivity depends on solute content or microstructure (relative volume fraction of phases in an alloy or phase aggregate), principally because of the strain arising from the solvent-solute size mismatch. The marginal deviation in the trend of the curves in Fig. 9 can be attributed to some error associated with the model or the primary dataset.

Finally, an attempt has been made to predict the precipitate stoichiometry by taking advantage of electrical conductivity, a property that is very sensitive to material condition. The sharp changes or kinks in the conductivity vs. solute ratio curves in Fig. 9 coincide with the phase boundary between the solid solution without precipitates and the solid solution with precipitates of specific stoichiometry. Such fixed stoichiometry is typical of intermetallic phases or compounds, unlike a solid solution with continuously varying solute content. In other words, the formation of such an intermetallic phase or compound should account for the sharp change in electrical conductivity across the composition boundary between the solid solution and the onset of precipitation of phases with fixed stoichiometry, owing to a sharp decrease in the degree of electron scattering in the matrix. It is for this reason that electrical conductivity is considered an appropriate property for determining the solvus point or line of a solid alloy.

Figure 9: Variation of electrical conductivity as a function of the atomic ratio between the respective principal elements, namely, (a) Ni:Si, (b) Cr:Si, and (c) Ni:Sn, at total alloying levels of 2, 4 and 5 at.%; these elements constitute the key intermetallic precipitate phases and provide the main source of strengthening in selected Cu-alloy systems

In Figs. 9a–9c, three different lines demarcate different alloying levels (2, 4 and 5 at.%). Each line shows different atomic ratios of the solute elements at the same total alloying content. Fig. 9a shows the variation of electrical conductivity with the atomic ratio of Ni:Si. When Ni:Si increases from 0 to 2, the electrical conductivity increases continuously, reaching a maximum at Ni:Si = 2. This is because more and more Si comes out of the solid solution with some Ni as a precipitate. For Ni:Si > 2, electrical conductivity decreases because the solute ratio (Ni:Si) no longer matches the precipitate stoichiometry and the extra Ni goes into solid solution. Therefore, the peak at an atomic ratio of 2 signifies the precipitation of Ni2Si in the system, in line with the existing literature [85,86]. Similarly, Figs. 9b and 9c show the variation of electrical conductivity with the atomic ratio of Cr:Si and Ni:Sn, respectively. When the Cr:Si or Ni:Sn atomic ratio increases from 0 to 3, electrical conductivity rises continuously because of the precipitation of Cr-Si and Ni-Sn phases. Beyond a ratio of 3, electrical conductivity stops increasing; this optimum at 3 therefore signifies the precipitation of Cr3Si and Ni3Sn, respectively, which is in good agreement with the precipitate stoichiometry reported in the available literature [85,87].
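The construction behind each line in Fig. 9 can be sketched in a few lines: the total solute content is held fixed while the atomic ratio is varied, and the ratio at which conductivity peaks is read off as the candidate stoichiometry. The function names and the conductivity values below are invented for illustration and are not taken from the figure.

```python
def split_by_ratio(total_at_pct, ratio):
    """Split a fixed total solute content (at.%) into elements A and B
    so that the atomic ratio A:B equals `ratio`."""
    b = total_at_pct / (1.0 + ratio)
    a = total_at_pct - b
    return a, b

def peak_ratio(ratios, conductivities):
    """Return the atomic ratio at which conductivity is highest
    (the first such ratio if the maximum is reached more than once)."""
    best = max(range(len(ratios)), key=lambda i: conductivities[i])
    return ratios[best]

# 4 at.% total solute at Ni:Si = 2 (the ratio discussed for Fig. 9a)
ni, si = split_by_ratio(4.0, 2.0)

# Hypothetical conductivity curve (%IACS) with a kink at ratio 2
ratios = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ec_values = [30.0, 34.0, 38.0, 42.0, 46.0, 43.0, 40.0]
print(ni, si, peak_ratio(ratios, ec_values))
```

For curves that level off instead of falling, as described for Cr:Si and Ni:Sn, `peak_ratio` returns the first ratio at which the maximum is reached, which corresponds to the onset of the plateau.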

Figs. 10a–10c show the variation of electrical conductivity and ultimate tensile strength with the amount of Ni, Si and Zn present in the concerned alloys. All these elements have one common feature: they all form a solid solution with copper in the chosen range (shown in Table 4). This is in contrast with the trend revealed in Fig. 9, where the systems reach the solubility limit and undergo precipitation. One more noticeable difference between Figs. 9 and 10 is that precipitation coincides with a rise in electrical conductivity, while this trend is absent in systems that exist only as solid solutions. As already pointed out, in solid solution alloys the continuous increase in strain field with increasing alloying element concentration causes a higher degree of electron scattering and also impedes dislocation glide; hence, there is a continuous decrease in electrical conductivity and an increase in ultimate tensile strength. Thus, the correlation presented in Fig. 10 can provide qualitative insight into the phases forming in the system when the content of the principal alloying element exceeds the solubility limit.

Figure 10: Variation of electrical conductivity (blue line) and ultimate tensile strength (orange line) as a function of the weight percent of the principal alloying elements (a) Ni, (b) Si, and (c) Zn in selected Cu-alloys

5 Conclusion

In summary, this study showcases the successful application of machine learning principles for designing copper alloys with an optimum combination of mechanical strength and electrical conductivity for copper conductors and switchgear applications. To achieve a high success rate for composition-to-property prediction, we developed a genetic algorithm-assisted Catboost machine learning model with more than 93% accuracy. This model is very promising as it is able to predict the optimized property combination for a given copper alloy composition. The results predicted by the present model were verified through careful comparison with nearly 40 sets of relevant experimental data reported in the literature; these 40 data points were not used in training or developing the model. Subsequently, several new high-performance conventional and non-conventional copper alloys that could offer an excellent combination of strength and conductivity were proposed by coupling the concepts of data augmentation and Pareto front refinement, as follows: Cu-0.05Cr-0.05Zr-5Ni-0.8Si-0.04P (1002 MPa, 60.2% IACS), Cu-0.42Cr-3.0Ni-0.55Si (919 MPa, 45% IACS), Cu-3.17Ni-1.44Si-0.022Zn-0.04P (967 MPa, 58% IACS), and Cu-3.17Ni-1.75Si-0.04P (987 MPa, 56.2% IACS). Further, to enhance the ease and pace of alloy design, property-to-composition models were also developed with high accuracy using data augmentation and data segregation. In addition, Model 2 was successfully employed for predicting the evolution of expected precipitates like Cr3Si, Ni2Si and Ni3Sn from a given alloy.

This study further demonstrates that machine learning methods can not only serve as a proof-of-concept for designing high-performance copper alloys but also provide a generic foundation for the development of any high-performance metallic alloy. However, the accuracy and comprehensiveness of the machine learning model can be further enhanced by incorporating process, thermal and other parameters as inputs, which in turn can provide better insight into the overall performance of the alloys, including the underlying mechanisms and microstructural evolution.

Acknowledgement: The authors are thankful for the supercomputing facility (Param Shakti) provided by IIT Kharagpur, established under the National Supercomputing Mission of the Government of India. One of the authors (I.M.) would like to acknowledge partial financial support from the DST-Sponsored Projects ‘JCP’ and ‘DGL’, the Ministry of Education-Supported SPARC Scheme Project ‘LSL_SKI’, and the ISRO-Sponsored Project ‘ONC’.

Funding Statement: The authors received no specific funding for this study.

Author Contributions: Study conception and design: P. Khandelwal, Harshit, I. Manna; data collection: P. Khandelwal; analysis of results: P. Khandelwal, Harshit; interpretation of results: P. Khandelwal, Harshit, I. Manna; draft manuscript preparation: P. Khandelwal, Harshit, I. Manna. All authors reviewed the results and approved the final version of the manuscript.

Availability of Data and Materials: The database is available in the supplementary material.

Conflicts of Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Supplementary Materials: The supplementary material is available online at https://doi.org/10.32604/cmc.2024.042752.