
Statistical Methods for Comparative Assessment of Quality Attributes in CMC Applications
Explore considerations for selecting statistical methods in demonstrating comparability, challenges in analytical similarity, and industry guidelines for assessing product quality attributes. Learn about technology transfer, method bridging, and more in CMC comparability studies.
Statistical methods for comparative assessment of quality attributes
Richard K. Burdick, Burdick Statistical Consulting, LLC
NCB, June 2019
Focus of today's presentation
- Provide a list of considerations for selecting an appropriate statistical method to demonstrate comparability.
- Describe additional challenges faced when demonstrating analytical similarity.
Definition of comparability
Guidance document ICH Q5E (2004): "The demonstration of comparability does not necessarily mean that the quality attributes of the pre-change and post-change product are identical, but that they are highly similar... existing knowledge is sufficiently predictive to ensure that any differences in quality attributes have no adverse impact upon safety or efficacy of the drug product."
CMC comparability applications
- Technology transfer
- Analytical method transfer
- Analytical method bridging study
- Change of contract manufacturer
- Change of manufacturing scale
- Scale-up of manufacturing processes in process characterization
Characteristics of CMC comparability studies
- Comparison of a pre-change process to a post-change process.
- In most comparability applications, the post-change process must be comparable to the pre-change process. Generally, there is no penalty if the post-change process is an improvement over the pre-change process.
- Typically one has an established knowledge base of the pre-change process.
- The number of post-change observations is relatively small in comparison to the pre-change.
Analytical similarity
A related topic to comparability is the demonstration of analytical similarity.
- EMA (2014 guideline): "An extensive comparability exercise will be required to demonstrate that the biosimilar has a highly similar quality profile when compared to the reference medicinal product."
- FDA (2019 draft guidance): "Although the scope of ICH Q5E is limited to an assessment of the comparability of a biological product before and after a manufacturing process change made by the same manufacturer, certain general scientific principles described in ICH Q5E are applicable to an assessment of biosimilarity between a proposed product and its reference product."
The proposed statistical tools are much the same as those used for comparability studies, but there are some key differences between the two problems.
Analytical similarity versus comparability
- There is less knowledge concerning the reference product in analytical similarity than concerning the pre-change process in comparability.
- Capability considerations for a biosimilar process are difficult without known specifications.
- Correlation among reference drug lots sourced from the same drug substance (DS) lot in analytical similarity presents issues for statistical estimation and hypothesis testing.
Learnings from the EMA workshop on the draft reflection paper on statistical methodology for the comparative assessment of quality attributes in drug development, 3-4 May 2018 (R. Martijn van der Plas)
1. Different contexts require separate considerations.
2. Clarification of terminology and language is needed.
3. It is important to understand the operating characteristics (OCs) of methods used for comparisons; well-understood frameworks to visualize OCs will be important for identifying suitable similarity criteria.
4. There is no unique optimal similarity criterion.
5. There is agreement that the quality of decision making can be improved.
Possible approaches for demonstrating comparability
Statistical tests:
- Equivalence of means (a minimal sketch follows this list)
- Equivalence of quantiles ("tail test", Mielke et al. (2019))
- Non-inferiority of variances
- Non-inferiority of process capability
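To make the first item concrete, here is a minimal sketch of an equivalence (two one-sided tests, TOST) comparison of means. The data, equivalence margin, and significance level are illustrative assumptions, not values from the presentation; the margin of 1.5 reference standard deviations echoes margins that have been discussed for analytical similarity.

```python
import numpy as np
from scipy import stats

def tost_means(pre, post, margin, alpha=0.05):
    """Two one-sided t-tests (TOST): conclude equivalence of means if the
    difference in means lies within +/- margin at the stated alpha
    (Welch-style standard error and degrees of freedom)."""
    diff = np.mean(post) - np.mean(pre)
    v_pre = np.var(pre, ddof=1) / len(pre)
    v_post = np.var(post, ddof=1) / len(post)
    se = np.sqrt(v_pre + v_post)
    df = (v_pre + v_post) ** 2 / (v_pre ** 2 / (len(pre) - 1)
                                  + v_post ** 2 / (len(post) - 1))
    p_low = stats.t.sf((diff + margin) / se, df)    # H0: diff <= -margin
    p_high = stats.t.cdf((diff - margin) / se, df)  # H0: diff >= +margin
    return max(p_low, p_high) < alpha               # True -> equivalent

rng = np.random.default_rng(1)
pre = rng.normal(100, 11.3, size=30)   # illustrative pre-change lots
post = rng.normal(100, 8.2, size=6)    # illustrative post-change lots
print(tost_means(pre, post, margin=1.5 * 11.3))  # margin is an assumption
```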
Possible approaches for demonstrating comparability (continued)
Heuristic rules:
- A specified percentage of post-change values must fall within the minimum and maximum of the collected pre-change data.
- A specified percentage of post-change values must fall within K pre-change sample standard deviations of the pre-change sample mean (FDA quality range).
- A prediction interval based on the post-change values must fall within a tolerance interval of the pre-change values (Boulanger, 2016).
How to select an approach?
Rather than compare all of these approaches today, I want to propose a set of considerations for selecting an approach, and evaluate the quality range approach against this list. Innerbichler (2018) and Stangler (2018) have provided detailed simulation studies that compare several of the proposed methods as they pertain to analytical similarity.
Considerations when selecting a statistical approach to demonstrate comparability
1. Protect patients from the consequences of concluding comparability when products are not comparable.
2. Protect sponsors from the consequences of concluding lack of comparability when products are comparable.
3. Incentivize sponsors to acquire post-change process knowledge, and perhaps reference product knowledge in analytical similarity.
4. Enable decision making with practical sample sizes and reasonable type 1 and type 2 error rates.
Considerations when selecting a statistical approach to demonstrate comparability (continued)
5. Examine the entirety of the process distribution.
6. Consider the criticality of the attribute and align criteria with subject matter expert (SME) knowledge.
7. Ensure transparency, ease of explanation, and ease of computation for scientists with no formal statistical training.
CMC example
Consider a tech transfer where it must be demonstrated that the manufacturing process at a new site (post-change) is comparable to the process at the present manufacturing location (pre-change). One quality attribute of interest is relative potency (%), which has a specification range of 70% to 130%. Consider three example data sets of six post-change lots manufactured at the new site, each compared to 30 lots from the present manufacturing site (reference).
Sample summary statistics for relative potency (%)

Group           Mean     Std Dev
Reference       100.07   11.30
Post-change 1    99.83    8.18
Post-change 2    98.67   20.13
Post-change 3    82.10    8.78
[Plot: the population models used for simulation. Reference: Mean=100, SD=10 (OOS=0.3%); best case post-change: Mean=100, SD=10 (OOS=0.3%); worst cases: Mean=80, SD=10 (OOS=16.0%) and Mean=100, SD=20 (OOS=13.4%).]
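The OOS percentages attached to these population models follow directly from the normal distribution and the 70% to 130% specification range; a quick check, assuming nothing beyond the stated means and SDs:

```python
from scipy import stats

def oos_rate(mu, sigma, lsl=70.0, usl=130.0):
    """Out-of-specification rate for a normal(mu, sigma) population:
    probability mass below the lower or above the upper spec limit."""
    return stats.norm.cdf(lsl, mu, sigma) + stats.norm.sf(usl, mu, sigma)

for mu, sigma in [(80, 10), (100, 20), (100, 10)]:
    print(f"Mean={mu}, SD={sigma}: OOS = {oos_rate(mu, sigma):.1%}")
# Mean=80, SD=10: OOS = 15.9% (about 16%)
# Mean=100, SD=20: OOS = 13.4%
# Mean=100, SD=10: OOS = 0.3%
```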
Considerations 1 and 2
1. Protect patients from the consequences of concluding comparability when products are not comparable.
2. Protect sponsors from the consequences of concluding lack of comparability when products are comparable.
Types of errors
- The burden of proof is on the manufacturer (sponsor).
- Alternative hypothesis: the post-change process is comparable to the pre-change process.
- So a type 1 error is stating the processes are comparable when such is not the case (patient risk).
- A type 2 error is failing to demonstrate comparability when the processes are comparable (sponsor risk).
- In order to address the first two considerations, one must be able to control both type 1 and type 2 error rates.
An analogous problem
This problem is analogous to acceptance sampling, where a lot of material is either accepted or rejected.
- Each lot can be described by its percentage of defective units.
- Type 1 and type 2 error rates are associated with the consequences of accepting a bad lot or rejecting a good lot.
- Operating characteristic (OC) curves are useful for selecting sampling designs.
Defining error rates
- The acceptance sampling problem has a definable quantitative metric of quality (percentage of the lot defective). This is not necessarily the case in the present comparability problem.
- To define the type 1 error rate, one must consider a worst case scenario and determine the probability of meeting the comparability criterion under that condition.
- To define the type 2 error rate, one must consider a best case scenario and determine the probability of meeting the comparability criterion under that condition.
Defining error rates (continued)
- The best case scenario is often where the two processes are identical.
- The worst case scenario answers the question often asked by regulators: "How bad do things have to get before you will reject comparability?"
- Subject matter experts (SMEs) are critical for defining these scenarios.
- The selection of scenarios and error rates must align with realistic risk profiles.
How to start?
Typically, a sponsor will devise an approach that provides an acceptable probability of passing when the two processes are identical (i.e., under the best case scenario). Consider a quality range approach. Hahn (1969, 1970) provides results that can be used to compute a prediction interval, based on n pre-change values, that will contain all m post-change values at a given level of confidence when the two processes are identical.
Example
With 90% confidence, the prediction interval based on n=30 reference values,

L = Ȳ_R - 2.55 S_R = 100.07 - 2.55(11.30) = 71.3
U = Ȳ_R + 2.55 S_R = 100.07 + 2.55(11.30) = 128.9

will contain all m=6 values from the post-change process when the processes are identical. Approximate values based on the Bonferroni inequality are available in JMP. The type 2 error rate when the processes are identical is 100% - 90% = 10%.
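A minimal sketch of the Bonferroni approximation mentioned on the slide, assuming only numpy and scipy; the exact factors tabulated by Hahn differ slightly from this approximation.

```python
import numpy as np
from scipy import stats

def k_factor(n, m, conf):
    """Bonferroni-approximate factor K for a two-sided prediction interval,
    built from n reference values, intended to contain all m future values
    with the stated confidence (after Hahn, 1969, 1970)."""
    alpha = 1.0 - conf
    t = stats.t.ppf(1.0 - alpha / (2.0 * m), df=n - 1)
    return t * np.sqrt(1.0 + 1.0 / n)

K = k_factor(n=30, m=6, conf=0.90)    # about 2.58; the slide's factor is 2.55
ybar, s = 100.07, 11.30               # reference summary statistics
print(K, ybar - K * s, ybar + K * s)  # interval near the slide's (71.3, 128.9)
```

The Bonferroni factor is slightly conservative; the exact factors used on the slides (2.55 at 90% confidence and 2.21 at 80%) are a bit smaller.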
[Plot: the three post-change data sets plotted against the (71.3, 128.9) quality range. Results: Pass, Pass, Fail.]
How bad do things have to be before we fail this rule?
The type 1 error rate must be defined for a worst case scenario. What is the probability of passing under each simulated model when K=2.55?
Type 1 error rates (K=2.55)
- Mean=80, SD=10 (OOS=16.0%): probability of passing = 16%
- Mean=100, SD=20 (OOS=13.4%): probability of passing = 25%
- Mean=100, SD=10 (OOS=0.3%): probability of passing = 90%
(Reference distribution: Mean=100, SD=10, OOS=0.3%.)
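These pass probabilities can be reproduced by straightforward simulation under the stated normal models; the replicate count and seed below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2019)

def prob_pass(mu, sigma, K=2.55, n=30, m=6, ref=(100.0, 10.0), reps=200_000):
    """Monte Carlo probability that all m post-change values fall inside the
    quality range (ybar - K*s, ybar + K*s) built from n reference values."""
    refs = rng.normal(ref[0], ref[1], size=(reps, n))
    ybar, s = refs.mean(axis=1), refs.std(axis=1, ddof=1)
    post = rng.normal(mu, sigma, size=(reps, m))
    lo, hi = (ybar - K * s)[:, None], (ybar + K * s)[:, None]
    return ((post > lo) & (post < hi)).all(axis=1).mean()

for mu, sigma in [(80, 10), (100, 20), (100, 10)]:
    print(mu, sigma, prob_pass(mu, sigma))  # near 16%, 25%, and 90%
```

Re-running with K=2.21 yields probabilities near the 6%, 15%, and 80% reported two slides below.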
Suppose we think the type 1 error is too high?
Increase the type 2 error to 20% when the populations are identical. The 80% prediction interval based on n=30 reference values and m=6 post-change values has K=2.21, giving L=75.1 and U=125.0.
[Plot: the three post-change data sets plotted against the (75.1, 125.0) quality range. Results: Fail, Pass, Fail.]
Type 1 error rates (K=2.21)
- Mean=80, SD=10 (OOS=16.0%): probability of passing = 6%
- Mean=100, SD=20 (OOS=13.4%): probability of passing = 15%
- Mean=100, SD=10 (OOS=0.3%): probability of passing = 80%
(Reference distribution: Mean=100, SD=10, OOS=0.3%.)
Another definition for comparability
Hauck et al. (2009) provide four options for comparing two analytical procedures that measure the same attribute. The previous discussion, where two data sets are compared, is in alignment with what they call performance equivalence. Another option declares a new procedure to be acceptable if it meets expectations without a requirement for direct comparison to a reference data set. This option of acceptability has evolved into the concept of the analytical target profile (ATP) introduced by Barnett et al. (2016). Using this paradigm, it seems easier to construct meaningful comparability criteria.
Acceptability requirement
Two processes are comparable if they both meet an acceptability requirement. In manufacturing, process capability is an important process attribute. One metric for process capability is the out-of-specification (OOS) rate when the process is operating in a state of control.
Possible acceptability requirement 33 : : H H C C 0 1 where is the proportion OOS when process is in control. The test is performed by computing a 100(1- )% upper bound U on and rejecting the null hypothesis if U<C. This has a type 1 error rate of . U canbe computed using the results of Mee (1988) based on the non-central t-distribution.
Worst case 1: Mean=100, SD=20 (OOS=13.4%)

H0: π ≥ 13.4%
H1: π < 13.4%, with α = 0.15

(α = 0.15 matches the type 1 error of the quality range approach with the 80% PI under this scenario.) The type 2 error rate under the reference distribution (Mean=100, SD=10, OOS=0.3%) is 11%, a bit less than the 20% of the quality range approach.
Worst case 2: Mean=80, SD=10 (OOS=16.0%)

H0: π ≥ 16.0%
H1: π < 16.0%, with α = 0.06

(α = 0.06 matches the type 1 error of the quality range approach with the 80% PI under this scenario.) The type 2 error rate under the reference distribution (Mean=100, SD=10, OOS=0.3%) is 27%, a bit greater than the 20% of the quality range approach.
Sample results
Testing H0: π ≥ 16.0% versus H1: π < 16.0% with α = 0.06:
- Data 1: 94% upper bound on π is U = 4.4%; pass, as U < 16%.
- Data 2: U = 37.4%; fail, as U > 16%.
- Data 3: U = 33.9%; fail, as U > 16%.
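Mee (1988) derives an exact upper bound on the OOS proportion from the non-central t-distribution. As a rough stand-in, the sketch below bounds each tail separately (a Bonferroni split of α) using the standard one-sided non-central t construction and then adds the two tail bounds; this is conservative relative to Mee's two-sided calculation, so it should land near, but not exactly on, the bounds reported above. The summary statistics are taken from the earlier table.

```python
import numpy as np
from scipy import stats, optimize

def tail_upper_bound(ybar, s, n, limit, alpha, side):
    """One-sided 100(1-alpha)% upper confidence bound on the proportion of a
    normal population beyond a single spec limit, via the non-central t."""
    d = (ybar - limit) / s if side == "lower" else (limit - ybar) / s
    t_obs = np.sqrt(n) * d
    # Noncentrality for which t_obs sits at the upper-alpha tail; dividing
    # by sqrt(n) gives a lower bound on the standardized distance to the limit.
    nc = optimize.brentq(lambda x: stats.nct.sf(t_obs, n - 1, x) - alpha,
                         -100.0, 100.0)
    return stats.norm.sf(nc / np.sqrt(n))

def oos_upper_bound(ybar, s, n, lsl, usl, alpha):
    """Conservative (Bonferroni) upper bound on the total OOS proportion;
    Mee (1988) gives a sharper exact two-sided calculation."""
    return (tail_upper_bound(ybar, s, n, lsl, alpha / 2, "lower")
            + tail_upper_bound(ybar, s, n, usl, alpha / 2, "upper"))

# Data set 1 summary statistics, 94% upper bound (alpha = 0.06):
print(oos_upper_bound(ybar=99.83, s=8.18, n=6, lsl=70.0, usl=130.0, alpha=0.06))
```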
Consideration 3: Incentivize sponsors to acquire post-change process knowledge, and perhaps reference product knowledge in analytical similarity.

Post-change sample size (m)   K      Test size (α)   P(reference passes), quality range   P(reference passes), OOS test
6                             2.21   0.06            80%                                  73%
8                             2.39   0.06            82%                                  86%
10                            2.52   0.06            83%                                  93%

Both pass probabilities increase with the post-change sample size, rewarding additional sampling.
Consideration 4: Enable decision making with practical sample sizes and reasonable type 1 and type 2 error rates.
- Practical sample sizes for most post-change processes are from 6 to 10, depending on the application. Post-change samples of size 3 are not definitive.
- One needs to be realistic with patient risk. A type 1 error rate below 0.10 seems unrealistic, and even greater values may be reasonable for some attributes with low risk.
Consideration 5: Examine the entirety of the process distribution.
- Individual assessment of means or variances ignores their interrelationship in impacting process capability. A post-change process with a different mean than the reference process may still produce acceptable product if it has lesser variance.
- If combining equivalence testing of means with non-inferiority of variances, it is necessary to align the criteria, as noted by Kringle et al. (2001).
Consideration 6: Consider the criticality of the attribute and align criteria with subject matter expert (SME) knowledge.
Regulatory agencies could play a role in establishing these standards.
Consideration 7: Transparency, ease of explanation, and ease of computation by scientists with no formal statistical training.
- Spreadsheet solutions are useful, but they should not be limiting if procedures can be performed with user-friendly statistical software.
- Statistical simulation will likely be needed to compute type 1 and type 2 error rates.
- Meaningful visual displays aligned with the numerical conclusions should always be provided.
Summary
- Sponsors should report both type 1 and type 2 error rates with the associated best case and worst case scenarios.
- Type 1 and type 2 error rates must be practical given manufacturing constraints on sample size.
- OOS might be a practical metric if one allows demonstration of acceptability to define comparability.
References
Barnett, K. L., McGregor, P. L., Martin, G. P., LeBlond, D. J., Weitzel, M. L. J., Ermer, J., Walfish, S., Nethercote, P., Gratzl, G. S., Kovacs, E., Analytical target profile: structure and application throughout the analytical lifecycle, Pharm Forum, Vol. 42 (2016), No. 5.
Boulanger, B., Assessment of analytical biosimilarity: the objective, the challenge and the opportunities, a report on the EFSPI working group, Basel, 2016.
European Medicines Agency, Guideline on similar biological medicinal products containing biotechnology-derived proteins as active substance: quality issues (revision 1), 2014.
FDA, CBER, CDER, Development of therapeutic protein biosimilars: Comparative analytical assessment and other quality-related considerations, Draft guidance, 2019.
Hahn, G. J., Factors for calculating two-sided prediction intervals for samples from a normal distribution, Journal of the American Statistical Association, Vol. 64, No. 327 (1969), pp. 878-888.
Hahn, G. J., Additional factors for calculating prediction intervals for samples from a normal distribution, Journal of the American Statistical Association, Vol. 65, No. 332 (1970), pp. 1668-1676.
Hauck, W. W., DeStefano, A. J., Cecil, T. L., Abernethy, D. R., Koch, W. F., Williams, R. L., Acceptable, equivalent, or better: approaches for alternatives to official compendial procedures, Pharm Forum, Vol. 35 (2009), No. 3, pp. 772-778.
International Conference on Harmonization, Q5E Comparability of biotechnological/biological products subject to changes in their manufacturing process, 2004.
Innerbichler, F., Comparison of 2 datasets of a quality attribute, IABS 5th Statistics Workshop: Approaches for CMC development and lifecycle management of biotherapeutics and vaccines, November 26-28, 2018.
Kringle, R., Khan-Malek, R., Snikeris, F., A unified approach for design and analysis of transfer studies for analytical methods, Drug Information Journal, Vol. 35 (2001), pp. 1271-1288.
Mee, R. W., Estimation of the percentage of a normal distribution lying outside a specified interval, Communications in Statistics - Theory and Methods, Vol. 17 (1988), No. 5, pp. 1465-1479.
Mielke, J., Innerbichler, F., Schiestl, M., Ballarini, N. M., Jones, B., The assessment of quality attributes for biosimilars: A statistical perspective on current practice and a proposal, The AAPS Journal (2019) 21:7, DOI: 10.1208/s12248-018-0275-9.
Stangler, T., Performance characteristics of quality range methods and equivalence testing in the comparative assessment of quality attributes, EMA workshop on the draft reflection paper on statistical methodology for the comparative assessment of quality attributes in drug development, May 2018.
van der Plas, R. M., Workshop on the draft reflection paper on statistical methodology for the comparative assessment of quality attributes in drug development, 3-4 May 2018.