Sample size re-estimation without un-blinding for time-to-event outcomes in oncology clinical trials

2018-03-27

THE JOURNAL OF BIOMEDICAL RESEARCH, 2018, Issue 1

Lihong Huang, Jianling Bai, Hao Yu, Feng Chen 2,✉

1 Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China;

2 Ministry of Education Key Laboratory for Modern Toxicology, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Introduction

Given the life-threatening nature and the unmet medical needs of many types of cancer, the drug development process in the field of oncology should be accelerated. To this end, many clinical trial approaches have been proposed. The sample size required for a clinical trial should be sufficiently large to provide a reliable answer to the questions addressed[1], and the method by which the sample size is calculated should be provided in the protocol. This calculation is a basic requirement for planning any study because it is critical to the success of the study and bears on budget considerations. For fixed sample size designs, there is a risk that the trial will not achieve adequate power because the parameters assumed in the planning phase are usually uncertain. This deficit can be remedied by re-estimating the parameters during the interim analysis and modifying the initially planned sample size if necessary[2–3].

Although various methods have been proposed and used for sample size re-estimation, a basic debate remains: should the interim data be examined with the treatment group in a blinded or un-blinded manner? Regulatory authorities clearly favor blinded sample size reassessment (BSSR) (CPMP Working Party on Efficacy of Medicinal Products[4]; ICH-E9 Expert Working Group[1]; European Medicines Agency/Committee for Medicinal Products for Human Use[5]) because it better preserves study integrity. Fortunately, the work of Gould and Shih[6] has shown that un-blinding is not necessary to efficiently estimate the within-group variance.

Previous studies of sample size re-estimation have addressed various statistical and practical aspects of this approach, such as BSSR in non-inferiority and equivalence trials with normally distributed outcome variables and hypotheses formulated in terms of the ratio and difference of means[7], BSSR in multi-armed clinical trials when the outcome variable is normally distributed[8], BSSR with negative binomial counts in superiority and non-inferiority trials[9], and BSSR with count data in multiple sclerosis[10].

However, progression-free survival (PFS), overall survival (OS) and time to progression (TTP) are widely used in oncology studies as primary endpoints to evaluate the efficacy of treatment. Thus, the response variable of most oncology clinical trials is survival time. To the best of our knowledge, the use of BSSR in this context has rarely been reported. Here, we tried to develop an EM approach for BSSR with exponentially distributed outcomes. The performance and applicability of this procedure are described based on several simulation studies.

Materials and methods

Derived EM algorithm

The sample-size formula for an oncology clinical trial can be simplified if it is expressed as the number of deaths required rather than the number of patients. Suppose that a two-sided test will be performed with a significance level of α and a power of 1-β for a hazard ratio Δ. Let z_{1-α/2} and z_{1-β} be the 1-α/2 and 1-β quantiles of the standard normal distribution, respectively, and let P_A and P_B be the proportions of patients randomized to treatments A and B, respectively. Then, the total number of deaths required is given by the following expression[11]:
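The display equation does not appear in this copy of the text. Assuming that expression (1) is the Schoenfeld formula cited as [11], which uses exactly the quantities defined above, it can be written as follows, with d denoting the total number of deaths (the symbol d is an assumption of this reconstruction):

```latex
d = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}}{P_A \, P_B \, (\ln \Delta)^{2}} \tag{1}
```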

In general, only Δ in the above formula is unknown and needs to be estimated based on previous studies. However, if the true Δ is much larger or smaller than assumed, the sample size will need to be re-estimated without breaking the randomization codes. Gould and Shih proposed an EM algorithm-based procedure for blinded variance estimation for normally distributed endpoints[6,12]; we extended this EM algorithm to the estimation of Δ for exponentially distributed endpoints without un-blinding the treatment group at the interim stage.
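As a rough illustration of how the required number of deaths depends on Δ, the sketch below assumes that expression (1) is the Schoenfeld formula cited as [11], i.e., (z_{1-α/2} + z_{1-β})² / (P_A P_B (ln Δ)²). The function name `required_events`, its default arguments and the choice to round up are assumptions made for illustration, not details taken from the paper.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.90, p_a=0.5, p_b=0.5):
    """Total number of events for a two-sided test of a given hazard ratio.

    Assumes a Schoenfeld-type formula; rounding up is an illustrative choice.
    """
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard_ratio must be positive and different from 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # quantile for the two-sided level
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    d = (z_alpha + z_beta) ** 2 / (p_a * p_b * math.log(hazard_ratio) ** 2)
    return math.ceil(d)


# Only (ln Δ)² enters the formula, so Δ and 1/Δ need the same number of events.
print(required_events(2.0), required_events(0.5))
```

Because the hazard ratio enters only through its squared logarithm, a misjudged Δ close to 1 changes the required number of events far more than the same absolute error for a large Δ.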

As the treatments are not identified, any interim observation xi, i = 1, …, n, could be in either treatment group, such that the treatment assignments are "missing at random". Let τi denote the treatment group indicator, i.e., τi = 1 (0) indicates that sample i is in treatment group 1 (group 2); τ1, …, τn are independent random variables with P(τi = 1) = θ. The density function of the exponential distribution is f(t) = λe^(-λt), where λ is the rate (hazard) parameter. Given τi, xi (i = 1, …, n) is distributed as follows, with density

Therefore, the expression for the expected value of τi given xi is

Then, the log-likelihood of the interim observations is L(λ1, λ2 | xi, τi)

After taking the partial derivatives of the above log-likelihood, the maximum likelihood estimates of λ1/λ2 are
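The display equations (2) to (5) do not appear in this copy of the text. Under the model stated above (exponential densities with rates λ1 and λ2, indicator τi with P(τi = 1) = θ), they would take the following form. This is a reconstruction from the definitions rather than the authors' typeset version; in particular, terms of the log-likelihood that involve only θ are omitted in (4), which is an assumption.

```latex
\begin{align}
f(x_i \mid \tau_i) &= \left(\lambda_1 e^{-\lambda_1 x_i}\right)^{\tau_i}
                      \left(\lambda_2 e^{-\lambda_2 x_i}\right)^{1-\tau_i} \tag{2} \\
E(\tau_i \mid x_i) &= \frac{\theta \lambda_1 e^{-\lambda_1 x_i}}
                           {\theta \lambda_1 e^{-\lambda_1 x_i} + (1-\theta)\lambda_2 e^{-\lambda_2 x_i}} \tag{3} \\
L(\lambda_1,\lambda_2 \mid x_i,\tau_i) &= \sum_{i=1}^{n}\left[\tau_i\left(\ln\lambda_1 - \lambda_1 x_i\right)
                      + (1-\tau_i)\left(\ln\lambda_2 - \lambda_2 x_i\right)\right] \tag{4} \\
\hat{\lambda}_1 &= \frac{\sum_{i=1}^{n}\tau_i}{\sum_{i=1}^{n}\tau_i x_i}, \qquad
\hat{\lambda}_2 = \frac{\sum_{i=1}^{n}(1-\tau_i)}{\sum_{i=1}^{n}(1-\tau_i)\,x_i} \tag{5}
\end{align}
```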

The EM algorithm for estimating λ1/λ2 proceeds as follows. For example, if θ is assumed to be 0.5, then the "E" step consists of substituting the "initial" estimates of λ1 and λ2 into formula (3) to obtain provisional values for the expected value of τi. The "M" step consists of obtaining maximum likelihood estimates of λ1/λ2 according to (5) after replacing τi in (5) with the provisional expectations. The "E" and "M" steps are repeated until the value of λ1/λ2 stabilizes.
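The E and M steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the E step is the posterior probability of group 1 under a two-component exponential mixture with fixed θ and that the M step is the weighted estimate (sum of weights divided by weighted sum of times). The function name `em_hazard_rates`, the `max_iter` safeguard and the simulated data are hypothetical.

```python
import math
import random


def em_hazard_rates(times, lam1, lam2, theta=0.5, tol=1e-3, max_iter=1000):
    """Blinded EM estimates of two exponential rates from pooled event times."""
    for _ in range(max_iter):
        # E step: expected value of the group indicator for each pooled observation
        weights = []
        for x in times:
            a = theta * lam1 * math.exp(-lam1 * x)
            b = (1 - theta) * lam2 * math.exp(-lam2 * x)
            weights.append(a / (a + b))
        # M step: weighted maximum likelihood estimates of the two rates
        new1 = sum(weights) / sum(w * x for w, x in zip(weights, times))
        new2 = sum(1 - w for w in weights) / sum((1 - w) * x for w, x in zip(weights, times))
        converged = abs(new1 - lam1) < tol and abs(new2 - lam2) < tol
        lam1, lam2 = new1, new2
        if converged:
            break
    return lam1, lam2


# Hypothetical blinded interim data: 50 events per group, labels discarded.
rng = random.Random(1)
pooled = [rng.expovariate(0.2) for _ in range(50)] + [rng.expovariate(0.1) for _ in range(50)]
lam1_hat, lam2_hat = em_hazard_rates(pooled, lam1=1.0, lam2=0.8)
print("estimated hazard ratio:", lam1_hat / lam2_hat)
```

In this sketch the ordering of the starting values is preserved by every iteration, so starting with lam1 > lam2 always returns an estimated ratio of at least 1.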

Simulations

We used simulations to evaluate the performance of the proposed procedure. Specifically, the Δ estimators were compared across different scenarios. Censoring was not considered in the following simulations because the derived EM algorithm was used to re-estimate the number of events. For the EM algorithm, the recursive computation was continued until successive estimates of λ1 and λ2 differed by less than 0.001. In addition, we designed a simulated clinical trial to investigate the power and illustrate the BSSR procedure.

Firstly, the sample size requirement of the above EM algorithm was investigated. We considered equal sample sizes for the two distributions; the population parameters were set as λ1 = 0.1, 0.12, 0.15, 0.2, 0.25, 0.3 and λ2 = 0.1, and the number of events per group ranged from 5 to 800 across scenarios. The initial values were set to 1.0 and 0.8. In addition, the completed proportion of the total sample size at the interim analysis is also a factor related to sample size. We investigated Δ = 0.66, 2.0 and 3.0. With α = 0.05 and power = 90%, we calculated the number of events using (1) for each scenario with a 100% completed proportion; these values are listed in column "n" in the first line of Table 1. The other "n" cells were calculated according to the defined completed proportion; for example, an 80% completed proportion means that 80% of the total number of events have occurred at the interim stage, so n is 103 for the scenario λ1 = 0.10 and λ2 = 0.15, calculated as 129 × 80%.

Secondly, because the initialization may affect the EM estimation procedure, we investigated its effects by calculating the Δ estimates according to the above EM procedure for 1,000 repeated simulations with different values of the initialization constants λ*1 and λ*2. We selected an equal number of events per group (n = 50) for the internal pilot study.

Thirdly, a balanced design has been considered in most sample size estimation studies. Here, we investigated the impact of the sample allocation ratio on the above EM procedure. With α = 0.05 and power = 90%, we calculated the number of events using (1) for each allocation ratio; the estimated values are listed in column "n" of Table 2. Because some cells of "n" had n1 or n2 values smaller than 20, and considering the impact of the sample size, the number of events was expanded to at least 50 for one of the groups to obtain column "n*" in Table 2. Δ was then re-estimated based on "n" and "n*", respectively.

In a simulation study, we investigated the power and the BSSR procedure with dummy randomized clinical trials in which the survival data of the two treatment groups were compared. We generated independent and identically distributed time-to-event observations from exponential distributions and censoring times from a uniform distribution.
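The data-generating step described above can be sketched as follows. The range of the uniform censoring distribution is not given in the text, so the upper bound `censor_max` is a hypothetical parameter, as are the function name `simulate_arm` and the values used in the example.

```python
import random


def simulate_arm(n, rate, censor_max, rng):
    """Return (observed_time, event_indicator) pairs for one treatment arm.

    Event times are exponential with the given rate; censoring times are
    uniform on (0, censor_max), which is an assumed range.
    """
    observations = []
    for _ in range(n):
        event_time = rng.expovariate(rate)
        censor_time = rng.uniform(0, censor_max)
        observed = min(event_time, censor_time)
        indicator = 1 if event_time <= censor_time else 0  # 1 = event, 0 = censored
        observations.append((observed, indicator))
    return observations


rng = random.Random(2018)
treatment = simulate_arm(56, 0.07, 60.0, rng)  # hypothetical arm size and censoring bound
control = simulate_arm(56, 0.14, 60.0, rng)
print(sum(e for _, e in treatment), sum(e for _, e in control))
```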

In our design, an initial sample size (n0) was calculated based on the assumed parameters. Although this sample constituted the final sample size for the fixed design, the sample size could be adjusted using BSSR based on half of the initial sample size, i.e., at 50% information time. The following parameter values were considered: assumed median survival times T = 10 and 5 for the treatment group and the control group, respectively; therefore, the rate parameters λ of the exponential distributions were calculated to be 0.07 and 0.14 based on the formula λ = log(2)/T. Two different scenarios were considered for the true median survival times, T' = 10, 5 and T' = 12, 5. Furthermore, the design was balanced, and the significance level and target power were the usual α = 0.05 (two-sided) and 1-β = 0.9.

The number of failures for the fixed design was estimated to be 45[13–14] per group based on the assumed parameters, and the internal pilot study included 23 failures per group for an information time of 50%. The assumed censoring rates were 0, 20%, and 40%, so the corresponding event rates were 100%, 80%, and 60%, respectively. Accordingly, these values resulted in fixed sample sizes (n0) of 45, 56, and 75, respectively, based on sample size = number of events/(event rate). The true parameters were used to generate simulated data with the fixed sample size (n0). Using the EM algorithm, the blinded re-estimated sample size (n1) was based on the 23 (half of 45) events from an assumed internal pilot study, defined as 50% information time of the whole trial[15]. Subsequently, 1,000 trials were simulated for each scenario.
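The arithmetic behind these design values can be checked with a short sketch. The function names are hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the sample sizes of 45, 56 and 75 stated above.

```python
import math


def rate_from_median(median_survival):
    """Exponential rate implied by a median survival time: lambda = log(2) / T."""
    return math.log(2) / median_survival


def sample_size_from_events(events, event_rate):
    """Sample size = number of events / event rate (rounded to the nearest integer)."""
    return round(events / event_rate)


# Assumed medians T = 10 and T = 5 give rates of about 0.07 and 0.14.
print(round(rate_from_median(10), 2), round(rate_from_median(5), 2))

# 45 events per group at event rates of 100%, 80% and 60%.
print([sample_size_from_events(45, r) for r in (1.0, 0.8, 0.6)])
```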

Table 1 EM re-estimation of Δ with different completed proportions on 1,000 runs

Table 2 EM re-estimation of Δ with different sample allocation ratios on 1,000 runs

Results

Sample size requirement

Fig.1 shows the EM re-estimation of Δ for different numbers of events; 1,000 simulation replications were performed for each situation. The estimates stabilized as the number of events increased. Δ was overestimated in all scenarios when the number of events per group was less than 30. The smaller Δ was, the larger the sample size required: a sample size of more than 100 was needed when Δ was 1.5. For Δ = 1.0, there was about 16.05% overestimation even when the sample size reached 800; this result was similar to that of Xie et al. (13.54%)[16].

Fig.1 EM re-estimation of Δ with different sample sizes

Table 1 also shows that the estimates were affected by the number of events (sample size). For λ1 = 0.10 and λ2 = 0.15, the estimates were all acceptable even when the completed proportion was 20%, which nearly satisfied the sample size requirement of this EM algorithm. In contrast, the estimates deviated significantly from the true values for λ1 = 0.3 and λ2 = 0.1 when the completed proportion was less than 80% because the number of events was insufficient for the EM algorithm.

Initial values

The simulation results are presented in Table 3. Specifically, the estimates obtained from the EM procedure depended on the initialization. Table 3 shows that the Δ estimates exceeded 1 if λ*1 > λ*2 and were less than 1 if λ*1 < λ*2. Fortunately, the estimated Δ values were very close to the true values when the ordering of the initial values matched that of the true values, i.e., λ*1 > λ*2 with λ1 > λ2, or λ*1 < λ*2 with λ1 < λ2. The number of required events estimated using (1) was the same for Δ = λ1/λ2 and Δ = λ2/λ1. Therefore, the choice of λ*1 > λ*2 or λ*1 < λ*2 did not affect the re-estimation of the number of events. Table 3 also shows the overestimation for Δ = 1 (λ1 = 0.2, λ2 = 0.2); the choice of initial values did not affect these overestimated results.

Sample allocation ratio

The results in Table 2 indicate that the EM estimates of Δ were affected by the sample allocation ratio. The balanced design produced the best estimates. The estimates from the unbalanced designs, including 2:1, 1:2, 3:1 and 1:3, were slightly larger or smaller, and the difference in sample size between the two groups directly correlated with the estimated values. Greater imbalance produced more biased estimates even though the number of events was increased to at least 50 for one group in the "n*" column.

Table 3 EM re-estimation of Δ with different initial values on 1,000 runs

Example (simulated)

Table 4 shows the statistical power of the log-rank test and exponential regression for the fixed design and for the design adjusted by sample size re-estimation. For the true median survival times of T' = 12 and 5, the original sample sizes were 45 for a 100% event rate, 56 for an 80% event rate and 75 for a 60% event rate, as shown in column "n0". These sample sizes were larger than needed to obtain 90% power for the log-rank test or exponential regression, so the power of both tests substantially exceeded 90%. After Δ was re-estimated using the above EM algorithm, the number of events was re-calculated. Based on the pre-defined event rates (60%, 80% and 100%), the corresponding sample sizes were re-calculated and are shown in column "n1". The re-estimated sample sizes "n1" were near the actual requirement, and the power for the adjusted design was closer to 90% for a 100% event rate.

For true median survival times of T' = 10 and 5, "n0" was the correct required sample size, and the re-estimated sample sizes "n1" were similar to the "n0" values; the power for both the fixed design and the adjusted design was close to 90% for a 100% event rate. The variation of the simulated EM estimates of λ1 and λ2 for each scenario is presented in Fig.2.

Notably, the power of both the log-rank test and exponential regression was inversely correlated with the event rate in each scenario because the sample size was calculated as the number of events/(event rate). Lower event rates required larger samples, which increased the power of the test.

Discussion

The topic of sample size reassessment during an ongoing trial has become very popular in recent years. The inflation of the type I error rate and the loss of power have long been an intractable problem for sample size adjustment in an internal pilot study with an adaptive design. In the ICH E9 guidelines, this is reflected in the following requirement for planned sample size adjustment: "The step taken to preserve blindness and consequences, if any, for the type I error and the width of confidence intervals should be explained". The actual type I error rate for the blinded case was previously derived for the t-test situation[17], and the results showed that the nominal inflation in the type I error rate did not significantly differ for the blinded sample size recalculation in the unrestricted design. Over the last decade, a number of studies have addressed the problem of BSSR for an ongoing clinical trial. These studies noted that the assumptions of the sample size calculation should be reassessed with the blinded data and that the effect on the type I error rate can be controlled[18]. To our knowledge, this work has been restricted to normally or binomially distributed data.

Cancer is currently one of the major diseases affecting human health, and anti-cancer drugs have become a focus of research in the pharmaceutical industry. Survival time-related outcomes such as PFS and OS are generally used as primary endpoints in oncology studies to assess efficacy. Thus, oncology trials usually require long follow-up periods and include a planned interim analysis, and sample size re-estimation is also essential. In this paper, we tried to extend the BSSR method to exponentially distributed survival data based on the EM algorithm.

The blinded data of an ongoing trial come from a mixture of two populations: one for the treatment group and the other for the control group. The EM method described here is applied to estimate the parameters of the mixture distribution and thereby assess the hazard ratio. The derived EM estimation can only be used to re-estimate the number of events for oncology trials, and censoring is ignored. Normally, this is acceptable in a clinical trial because the censoring rate is unknown and only assumed at the design stage. Given the number of events re-estimated at the interim stage, the overall censoring rate of the two treatments, which is available without breaking the blind, can be used for sample size re-estimation, because sample size = the re-estimated number of events/(1 - interim-stage censoring rate). On the other hand, only the exponential distribution has been considered in our research, and the Weibull distribution needs to be investigated in the future since it is more widely applied in survival analysis.
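The conversion from a re-estimated number of events to a sample size can be sketched as follows. The pooled censoring rate is computed from blinded interim event indicators, which requires no treatment labels. The function name `blinded_total_sample_size`, the rounding up and the example data are assumptions made for illustration.

```python
import math


def blinded_total_sample_size(reestimated_events, interim_indicators):
    """Sample size = re-estimated events / (1 - pooled interim censoring rate).

    interim_indicators holds 1 for an observed event and 0 for a censored
    observation, pooled over both arms. Since 1 - censoring rate equals
    events / observations, the ratio is computed directly from the counts.
    """
    n_observations = len(interim_indicators)
    n_events = sum(interim_indicators)
    if n_events == 0:
        raise ValueError("no events observed at the interim stage")
    return math.ceil(reestimated_events * n_observations / n_events)


# Hypothetical interim data: 8 events and 2 censored observations (20% censoring).
print(blinded_total_sample_size(45, [1] * 8 + [0] * 2))
```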

Table 4 Simulated power for fixed and adjusted designs

Fig.2 Kernel density of the λ1 and λ2 estimates. A and C are for Scenarios 1-3 in Table 4; B and D are for Scenarios 4-6 in Table 4.

Our studies show that estimates from the EM method are highly variable, which is consistent with the literature on interim analysis of treatment effects with blinded data and a normally distributed endpoint. In addition, relatively small hazard ratios (Δ < 1.2) are overestimated by this EM method.

Specifically, for hazard ratios of Δ ≥ 1.2, the EM estimation shown here is subject to a sample size requirement, and the stability of the estimates directly correlated with the number of events. The initial values of the parameters directly affect the convergence of the algorithm and the estimated results. The simulation shows that the initial values should be carefully selected, and the calculation of optimal initial values requires further research. Moreover, the EM estimation described here is more suitable for a balanced design.

Due to the variation and the overestimation of smaller hazard ratios, no reliable inference can be made on sample size re-estimation in our studies. The results of this paper provide useful information to steer practitioners in this field away from repeating the same endeavor. This should be of some relief to health authorities.

This research was supported by the National Natural Science Foundation of China (81273184) and the National Natural Science Foundation of China Grant for Young Scientists (81302512). We would like to thank the referees for their comments, which greatly helped us improve the manuscript.

[1] ICH. ICH Harmonised Tripartite Guideline. Statistical principles for clinical trials. International Conference on Harmonisation E9 Expert Working Group[J]. Stat Med, 1999, 18(15): 1905–1942.

[2] Posch M, Bauer P. Interim analysis and sample size reassessment[J]. Biometrics, 2000, 56(4): 1170–1176.

[3] Proschan MA. Sample size re-estimation in clinical trials[J]. Biom J, 2009, 51(2): 348–357.

[4] CPMP. Biostatistical methodology in clinical trials in applications for marketing authorizations for medicinal products. CPMP Working Party on Efficacy of Medicinal Products Note for Guidance III/3630/92-EN[J]. Stat Med, 1995, 14(15): 1659–1682.

[5] EMEA/CHMP. Reflection Paper on Methodological Issues in Confirmatory Clinical Trials with Flexible Design and Analysis Plan[J]. Available online at http://wwwemeaeuint. 2006.

[6] Gould AL, Shih WJ. Sample size re-estimation without unblinding for normally distributed outcomes with unknown variance[J]. Commun Stat Theory Methods, 1992, 21(10): 2833–2853.

[7] Friede T, Kieser M. Blinded sample size reassessment in non-inferiority and equivalence trials[J]. Stat Med, 2003, 22(6): 995–1007.

[8] Kieser M, Friede T. Blinded sample size reestimation in multiarmed clinical trials[J]. Drug Inf J, 2000, 34(2): 455–460.

[9] Friede T, Schmidli H. Blinded sample size reestimation with negative binomial counts in superiority and non-inferiority trials[J]. Methods Inf Med, 2010, 49(6): 618–624.

[10] Friede T, Schmidli H. Blinded sample size reestimation with count data: methods and applications in multiple sclerosis[J]. Stat Med, 2010, 29(10): 1145–1156.

[11] Schoenfeld DA. Sample-size formula for the proportional-hazards regression model[J]. Biometrics, 1983, 39(2): 499–503.

[12] Chang M. Adaptive design theory and implementation using SAS and R[M]. Boca Raton: Chapman & Hall/CRC, 2008.

[13] Desu MM, Raghavarao D. Sample size methodology[M]. Boston: Academic Press, 1990.

[14] Bain LJ, Engelhardt M. Statistical analysis of reliability and life-testing models: theory and methods[M]. 2nd ed. New York: M. Dekker, 1991.

[15] Birkett MA, Day SJ. Internal pilot studies for estimating sample size[J]. Stat Med, 1994, 13(23-24): 2455–2463.

[16] Xie J, Quan H, Zhang J. Blinded assessment of treatment effects for survival endpoint in an ongoing trial[J]. Pharm Stat, 2012, 11(3): 204–213.

[17] Friede T, Kieser M. Sample size recalculation in internal pilot study designs: a review[J]. Biom J, 2006, 48(4): 537–555.

[18] Kieser M, Friede T. Simple procedures for blinded sample size adjustment that do not affect the type I error rate[J]. Stat Med, 2003, 22(23): 3571–3581.
