Research Article

An inter-laboratory retrospective analysis of immunotoxicological endpoints in non-human primates: T-cell-dependent antibody responses

Pages 238-250 | Received 31 Mar 2011, Accepted 04 May 2011, Published online: 21 Jun 2011

Abstract

The Immunotoxicology Technical Committee of HESI sponsored a retrospective analysis of T-cell-dependent antibody responses in non-human primates (NHP). Antibody responses to keyhole limpet hemocyanin (KLH), tetanus toxoid (TT), and/or sheep red blood cells (SRBC) in 178 NHP (from 8 sponsors, 13 testing sites, 30 studies) were statistically analyzed. Rates of positive or negative anti-KLH, -TT, and -SRBC primary and secondary IgM and IgG responses were compared. The influence of gender, country of origin, and previous immunization with a different antigen on the response rate and kinetics of anti-KLH and anti-TT responses was analyzed, as were the magnitude of the antibody responses and the impact of these factors on that magnitude. Power calculations were also conducted, based upon the inter-individual variability of the peak response values. The analysis demonstrated that the rates of positive responses were similar between the two genders, were high for KLH, SRBC, and TT challenges by 21 days following immunization (87, 100, and 84%, respectively, for IgGs), and did not differ significantly by NHP country of origin. Mean peak secondary responses were greater than peak primary responses; the magnitude of the response to KLH was increased by incomplete Freund’s adjuvant (IFA). Gender had little effect on the magnitude and variability of these responses. KLH and TT were associated with similar inter-animal variability, whereas in some situations KLH responses were less variable than responses to SRBC. The data suggested that inter-animal variability with KLH was similar with or without IFA. Power analysis illustrated that animal group sizes of typical standard toxicology studies (generally ≤ 4/sex) are likely to detect only fairly large treatment effects. However, combining males and females, when appropriate, will improve the power: an N of 8 to 12 could detect ≤ 3.1-fold differences in anti-KLH IgG responses.

Introduction

The T-cell-dependent antibody response (TDAR) to T-cell-dependent antigens is a functional assay used in immunopharmacology and immunotoxicology to assess the ability of the species of interest to mount a specific antibody (IgM and/or IgG) response to immunization (Luster et al., 1992, 1993; reviewed by Piccotti, 2008). In toxicology regulatory guidance documents, the TDAR is considered one of the functional assays of choice to evaluate potential immunotoxicity of investigational new drugs (CPMP, 2000; FDA, 2002; ICH, 2006). While this assay is extensively used in rodents, the significant increase in the number of biopharmaceuticals being developed that lack cross-reactivity in rodents has created a need for non-human primate immunotoxicology models.

Studies have been dedicated to the optimization of the immunization protocols for conducting a TDAR assay in non-human primates (NHP) (Piccotti et al., 2005; Caldwell et al., 2007; Haggerty, 2007; Kirk et al., 2008; Tichenor et al., 2010). However, the multiple variables potentially associated with such studies make the optimization of these protocols challenging. In that context, the Immunotoxicology Technical Committee of the International Life Sciences Institute (ILSI) Health and Environmental Sciences Institute (HESI) sponsored a retrospective inter-laboratory study of T-cell-dependent antibody responses in NHP. This work is intended to provide perspective on the impact of study design on TDAR responses in NHP for potential immunotoxicological/immunopharmacological assessments of pharmaceuticals. Specific questions addressed in this retrospective study are as follows:

  • What are the rates of positive (i.e., detectable) primary and secondary responses to keyhole limpet hemocyanin (KLH), sheep red blood cell (SRBC) and tetanus toxoid (TT) immunizations in cynomolgus macaques?

  • Which factors influence the rates of positive primary responses? The analysis included animal characteristics (gender, country of origin) and immunization characteristics (type of challenge, single vs. multiple antigens and primary vs. secondary challenge).

  • How different are the mean responses between peak primary vs. peak secondary responses?

  • Are there differences between male and female NHP in their mean peak responses?

  • What is the inter-animal variability of the peak responses and which factors are associated with differences in the variability? The factors considered in this question include the factors included above for analysis of rates of positive responses.

  • What differences in peak responses are likely to be detected in future studies in cynomolgus macaques, given immunization protocols and sample size?

Methods

TDAR variables

Eight sponsors submitted data for this work. The data comprised a total of 30 studies conducted at thirteen different testing sites. Each sponsor provided information on the study design used in each TDAR study, including: (i) test system: NHP (cynomolgus macaque) origin (China, Indonesia, Mauritius, Philippines, unknown), strain, gender, age, number of animals per group, (ii) antigen (KLH, TT, SRBC), dose, presence or absence of adjuvant, timing of immunization (when several antigens were used), and (iii) immune response readout (amount of antigen-specific IgM or IgG obtained from immunoassays). Twenty-six hundred sample results from 178 untreated or vehicle-treated immunized cynomolgus macaques were evaluated.

TDAR endpoints

Sponsors utilized different ELISA methodologies to measure antigen-specific antibody responses in serum samples collected before and after immunization with KLH, TT, or SRBC. Analyses were conducted by measuring serial dilutions of each sample to determine an antibody titer or by analyzing specific dilutions of each sample against a reference sample. Depending on the specific assay conditions for each study, results were expressed in different units, e.g., mass concentration, cut-point titer (derived from the highest serum dilution providing a signal above the limit of quantitation), or center-point titer (derived from the serum dilution associated with half the assay maximum optical density). In order to analyze the impact of several factors on the outcome of TDAR assays, two types of TDAR endpoints were considered: the presence of a positive (i.e., detectable) antibody response over time, and the magnitude of the response observed.

Positive antibody response

To overcome numerical differences in values derived from different immunoassay methodologies to measure antigen-specific IgM and IgG responses (associated with different units of measurement), each sample was qualified as negative or positive on the basis of the limit of quantification for each particular assay. A sample value greater than the limit of quantification of the assay utilized was defined as “positive”. A total of 1435 samples from 146 NHP and 22 studies were classified as positive or negative. In order to account for differences in the timing of sampling days for different studies, the positive responses were defined in reference to study Days 10 and 21 (with the challenge occurring on Day 1). These two time-points best bracketed available time-points from all studies and were consistent with anticipated durations of development of antibody responses to neoantigens (Feldmann, 1996). For each monkey and antigen tested, the presence of one or more positive samples within 10 days, or within 21 days, of the antigen challenge led to a positive classification for that time period. Animals from studies in which no response data were sampled by Day 10 were used in the Day 21 dataset but excluded from the Day 10 dataset.
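The classification rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used for the study: the `Sample` record, the `classify` function, and the convention that post-challenge samples have a study day greater than 1 are all assumptions made for the example. The exclusion is also applied per animal here, whereas the text describes it at the study level.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    study_day: int   # study day of the blood draw (challenge assumed on Day 1)
    value: float     # assay readout, in that study's own units
    loq: float       # limit of quantification of the assay used

def classify(samples, cutoff_day):
    """Classify one animal/antigen/isotype for a given cutoff (10 or 21).

    Returns True (positive), False (negative), or None when there is no
    post-challenge sample on or before cutoff_day, in which case the
    animal would be left out of that dataset.
    """
    window = [s for s in samples if 1 < s.study_day <= cutoff_day]
    if not window:
        return None
    # One or more samples above the assay's LOQ makes the animal positive.
    return any(s.value > s.loq for s in window)

# Hypothetical animal: below the LOQ on Day 7, above it on Day 14.
animal = [Sample(7, 0.5, 1.0), Sample(14, 3.2, 1.0)]
print(classify(animal, 10), classify(animal, 21))
```

Because each sample is compared only with the limit of quantification of its own assay, the resulting positive/negative calls can be pooled across studies that report in different units.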

TDAR value at peak response

For a given study, challenge (antigen, primary or secondary immunization) and antigen-specific antibody isotype (IgM or IgG), the mean log TDAR value for each post-immunization sampling day was calculated. The day with the largest mean was designated as the peak response day for that study, challenge and antibody. For all animals in a study, the response on the designated peak response day was used in the peak response analysis. In cases when multiple sampling days were tied for the maximum mean log TDAR response, the earliest of these days was considered the peak response day. Only data from antibody responses with at least 2 post-immunization time-points and with the more common measurement units (center-point titer, cut-point titer, estimated titer and mass concentrations) were used. The total sample size for this endpoint was 132 NHP (313 peak samples) from 5 sponsors (9 sites and 21 studies). The log peak TDAR values were analyzed rather than the original values for two reasons: (i) immune response readouts for most of the units were skewed; and, (ii) use of the log values allowed for inclusion of results measured in different units within a single analysis. Under the assumption that relative (percent) changes are comparable between different units, the log-transformation enabled pooling and comparison of data measured by different units.
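The peak-day rule described above (largest mean log value, earliest day on ties) could be expressed as follows. This is a hedged sketch: the function name `peak_day`, the tuple layout of the records, and the example values are hypothetical, and all readouts are assumed to be strictly positive so that the logarithm is defined.

```python
import math
from collections import defaultdict

def peak_day(records):
    """Pick the peak response day for one study, challenge and isotype.

    records: iterable of (animal_id, study_day, value) tuples containing
    post-immunization samples only. Returns the day with the largest mean
    log_e value; if several days tie, the earliest one is returned.
    """
    logs_by_day = defaultdict(list)
    for _animal, day, value in records:
        logs_by_day[day].append(math.log(value))
    means = {day: sum(v) / len(v) for day, v in logs_by_day.items()}
    best = max(means.values())
    # isclose guards against ties being missed through rounding error.
    return min(day for day, m in means.items() if math.isclose(m, best))

# Hypothetical study with two animals sampled on Days 7, 14 and 21.
records = [
    ("A", 7, 10.0), ("B", 7, 40.0),
    ("A", 14, 100.0), ("B", 14, 400.0),
    ("A", 21, 200.0), ("B", 21, 200.0),
]
print(peak_day(records))  # Days 14 and 21 tie on the mean log; Day 14 is chosen
```

Every animal in the study is then read out on that single designated day, even if an individual animal's own maximum falls on a different day.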

Statistical methods

Positive response analysis

Descriptive statistics for positive response rates (the number and the rate) were presented by group as N (%) per group. P-values for comparison of the rates between groups were calculated by generalized linear mixed models (GLMM), which accounted for the correlation of observations within the same study. In a few situations, the model did not converge and the p-values comparing groups were calculated either by the chi-squared test or by Fisher’s exact test. These two tests provide approximate p-values for the comparisons; the p-values may be anti-conservative (i.e., too small) because they did not take account of the study effect.

Peak response analysis

The mean value of the peak response was compared between the primary and the secondary response, with and without adjuvant, and between the two genders. In addition, the inter-animal variability of the peak response was compared between the primary and the secondary response, between the two genders, among challenge types and for different monkey origins. Heuristically, there were two different kinds of comparisons. For two factors (primary vs. secondary response and gender), their values varied within some studies and thus direct comparisons were possible within these studies. Effects of each of the two factors (expressed as either the difference of mean values between the two groups or the ratio of standard deviations [SD] for the two groups) are presented graphically for each study. A companion pooled result and a p-value (likelihood ratio test) for the comparison between two groups were calculated by a linear mixed model with a random effect for the study. The remaining factors (e.g., challenge type, single vs. multiple challenge and species origin) were always constant within a study and therefore the effect of these factors on the peak variability had to be assessed by comparing different studies. The inter-animal variability, expressed by the SD on the log_e scale and by the same SD back-transformed to the original scale (e^SD), is presented for studies sharing the same NHP origin and challenge characteristics.

The analysis of the inter-animal variability of peak response was restricted to studies that utilized any of the following four measurement units: center-point titer, cut-point titer, estimated titer and mass concentration. Each of these units had a different quantitative meaning, and a direct comparison of numeric values or log numeric values between different units drawn from different studies was problematic or impossible. Thus, for the analysis of the impact of type of challenge or of animal origin on inter-animal variability, each analysis was restricted to a group of studies with the same measurement unit.

There were four such groups of studies where analysis was feasible, corresponding to the four measurement units just noted. This restriction on measurement units and studies was not necessary for the analysis of the impact on inter-animal variability of (i) primary vs. secondary response or (ii) gender. For each of these two factors, there was a substantial collection of studies with both genders (or both primary and secondary response) assessed in each study. In such datasets, a mixture of studies with different measurement units can be employed for variability analysis, when (i) the log transform is used, and (ii) the comparison of variability is based on the within-study ratio of standard deviations between groups being compared. The “inter-animal variability” as analyzed in this study incorporates both the intra- and inter-animal components. Data were not available to study each component of variation separately. Bartlett’s test and a likelihood ratio test (from a linear mixed model) were used to statistically test the equality of inter-animal variability among groups.
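A short numerical check may help show why the within-study SD ratio on the log scale tolerates mixed units. The sketch below assumes that two units differ only by a constant multiplicative factor, which is a simplification of the stated assumption that relative changes are comparable between units. The peak values, the helper `log_sd`, and the factor of 250 are all hypothetical.

```python
import math
import statistics

def log_sd(values):
    """Between-animal sample SD of the natural-log (log_e) values."""
    return statistics.stdev([math.log(v) for v in values])

# Hypothetical peak values for one study, split by sex.
males = [120.0, 300.0, 95.0, 410.0]
females = [150.0, 260.0, 80.0, 520.0]
ratio = log_sd(males) / log_sd(females)

# The same animals re-expressed in a hypothetical unit 250 times larger.
# log(250 * x) = log(250) + log(x), so every log value shifts by the same
# constant and the SD, and hence the SD ratio, is unchanged.
scale = 250.0
ratio_rescaled = (log_sd([scale * v for v in males])
                  / log_sd([scale * v for v in females]))

print(round(ratio, 6), round(ratio_rescaled, 6))
```

The raw SDs of the untransformed values would change by the factor of 250, which is why the untransformed values cannot be pooled across units in the same way.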

Power calculations

A group of studies was chosen with NHP origin and challenge characteristics that were common in these data and the inter-animal variability (SD) was calculated for these study parameters for use in the power calculations. The power calculations assumed a two-group comparison of peak response means by a two-sided two-sample t-test on the log_e data (one sample per animal, 80% power and p < 0.05 for statistical significance). The sample size was varied from 3 to 12 NHP per group. The detectable difference for each sample size is presented on the original scale as fold differences (simply e^difflog, where difflog is the detectable difference on the log_e scale). For example, a 2.7-fold difference between means for two groups means that the geometric mean for one group is 2.7 times as large as the geometric mean for the other group. “Fold difference” and “ratio” may be used interchangeably. All calculations were carried out in R, version 2.11.0 (Vienna, Austria). A p < 0.05 was used to denote statistical significance.
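The relationship between group size, log-scale SD, and detectable fold difference can be illustrated with a simplified calculation. The sketch below is not the authors' R code: it swaps the t-test calculation for a normal approximation, so it understates the detectable difference at small group sizes. The function name `detectable_fold` and the SD of 0.74 used in the loop are illustrative choices.

```python
import math
from statistics import NormalDist

def detectable_fold(sd_log, n_per_group, alpha=0.05, power=0.80):
    """Approximate detectable fold difference between two group means.

    Normal approximation to a two-sided two-sample comparison on the
    log_e scale: difflog = (z_(1-alpha/2) + z_power) * SD * sqrt(2/n).
    The result is back-transformed as e^difflog.
    """
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    diff_log = z_total * sd_log * math.sqrt(2 / n_per_group)
    return math.exp(diff_log)

# Illustrative SD on the log_e scale; group sizes as in the text (3 to 12).
for n in range(3, 13):
    print(n, round(detectable_fold(0.74, n), 2))
```

Because the detectable difference scales with the square root of 2/n on the log scale, pooling males and females into one group shrinks the detectable fold difference multiplicatively rather than additively.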

Results

NHP and challenges characteristics

Table 1 shows the number of NHP classified by their characteristics (gender and origin) and the characteristics of the immunization conditions (primary or secondary immunization, antigen, adjuvant) and TDAR assay readout unit.

Table 1.  Characteristics of NHP, immunization conditions, and TDAR assay readout units*.

NHP challenged by KLH had the highest representation in this analysis, with 154 of the 178 total animals (87%). Fewer NHP were immunized with TT (61 of 178, 34%), and SRBC (22 of 178, 12%). Percentages add to more than 100%, because 35% of the NHP received two or more immunogens. Most of the studies had approximately equal numbers of males and females, resulting in an approximately equal number for the two genders overall (86 of 178, 48% female; 92 of 178, 52% male). The most common country of origin was China (88 of 178, 49%). The remaining origins were Indonesia, Mauritius and The Philippines. While the age of the animals was known and reported to be within 1.6–7 years of age for the majority of the NHP (83%), it was unknown for 17% of the animals and study of age as a risk factor was not possible in this analysis. The sample sizes available for the positive response and for the peak response analyses were defined by study design features described in the methods section and are shown in .

Positive response analysis

Effect of antigen

Table 2A presents the estimated rates of positive primary response by antigen and antibody isotype. Responses were generally consistent between 10 and 21 days following immunization on Day 1 for each of the antibody isotypes. The 10-day antigen-specific IgM antibody response showed the most diversity. For the IgM antibody, the challenge by KLH resulted in the highest positive response rates at 10 days (93%) compared to SRBC (75%) and TT (50%). For the IgG antibody response, the challenge by SRBC resulted in the highest positive response rates at 10 days (100%) compared to KLH (79%) and TT (73%). The differences between the three groups, however, were not statistically significant for either of the two antibody isotypes or the two intervals (10 and 21 days). Table 2B presents the estimated rates of positive secondary response by antigen and antibody isotype. Response rates were high (92 to 100%) and similar (p = 0.14) for anti-KLH and anti-TT secondary IgG responses 10 and 21 days following a secondary immunization. IgM was rarely measured following secondary antigen challenge, so no comparison was made between antigens for this isotype.

Table 2A.  Positive response rate by antigen: Primary response only.

Table 2B.  Positive response rate by antigen: Secondary response only.

Effect of adjuvants, number of antigens, gender and NHP origin

Rates of positive response were compared between presence vs. absence of adjuvant, single vs. multiple antigens, by gender and NHP country of origin. Comparison for SRBC challenges was very limited. All SRBC challenges were for 12 Mauritius NHP (6 males and 6 females), with primary challenges only (i.e., no secondary SRBC challenges were available). All of the 12 animals had a positive IgG response on or before Day 10. Four of the animals (2 males and 2 females) had data on the IgM response: three were positive (two males and one female) and one was negative (a female). For KLH and TT challenges, more comparisons were feasible given the larger sample sizes. Table 3 presents these comparisons for KLH and TT challenges.

Table 3.  Impact of risk factors on the rate of positive primary response, stratified by antigen, antibody and duration. Primary response only.

There was a statistically significant difference in the comparison of response to single vs. multiple antigens at 10 and 21 days for the TT challenge. None of the other comparisons resulted in statistically significant differences in the rates of positive primary responses for IgM or IgG in either the 10-day or 21-day post-immunization intervals.

Adjuvants

While KLH challenges with IFA adjuvant had lower positive response rates for both IgM and IgG (88 and 82% at 21 days, respectively) compared to KLH without an adjuvant (97 and 90% at 21 days, respectively), these differences were not statistically significant (p = 0.2 for IgM and p = 1.0 for IgG). IFA was therefore concluded not to significantly change the rate of positive responses relative to administration of KLH alone. A small number of NHP had TT challenge with the Alum adjuvant (N = 10). Of these, 50% responded positively to the challenge on or before Day 10, compared to 88% positive response for NHP with TT alone (not statistically significant, p = 0.2). The corresponding comparison at Day 21 was 50 vs. 94% (not statistically significant, p = 0.07).

Number of antigens

For all analyses where single and multiple antigens could be compared, the response after immunization with multiple antigens was slightly higher for KLH and significantly higher for TT, compared with the response to immunization with a single antigen. This contrast between single and multiple antigens was more pronounced and statistically significant for the response to TT by Day 10 and Day 21 (p = 0.001 and p < 0.001, respectively).

Gender

Positive response rates were similar between males and females in all comparisons presented in Table 3. For most of the immunization conditions, a generalized linear mixed model for gender could not be estimated; however, no significant differences were noted. The largest gender difference was observed in the IgG response to TT immunization: 7 of 11 female (64%) and 12 of 15 male (80%) NHP were positive on or before Day 10. These rate estimates are, however, calculated from a small number of NHP (11 female and 15 male NHP); the difference was not statistically significant (p = 0.4) and rates were equivalent on Day 21 (male 84%; female 85%).
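As an illustration of one of the fallback tests named in the statistical methods, the counts quoted above (7 of 11 females and 12 of 15 males positive) can be run through a two-sided Fisher's exact test. The text does not state which test produced the reported p-value, so this is only a consistency check under that assumption; `fisher_exact_two_sided` is a hypothetical helper written for this example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Probability of k positives in the first row, margins fixed.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    # Small tolerance so ties with the observed table are not lost to rounding.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Rows: females, males; columns: positive, negative by Day 10 (IgG, TT).
p = fisher_exact_two_sided(7, 4, 12, 3)
print(round(p, 2))  # about 0.41
```

The result of about 0.41 is in line with the reported p = 0.4. As the statistical methods note, a test of this kind ignores the study effect, so its p-values may be somewhat too small.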

NHP origin

Despite a low number of animals (n = 4 and 5, respectively), Mauritius and Philippine NHP had positive responses to KLH (both IgM and IgG) for all 10- and 21-day analyses. In the six analyses where two or more origins could be compared (Table 3), there were no statistically significant differences in response rates among origins. Descriptively (and this pattern may be due to chance), NHP from China had the lowest response rate in five of the six origin comparisons that could be made from the data (see Table 3). Indonesia had a slightly lower response rate than Mauritius in two out of four comparisons, and slightly lower than The Philippines in two out of three comparisons. For the KLH challenge, positive response rates varied between 82 and 100% for animals of different origins. For the TT challenge, only China and Indonesia could be compared, and only on the IgG response. The positive response comparison for these two origins reverses between Days 10 and 21 and is inconclusive, possibly due to limited sample sizes.

Primary vs. secondary response rates

In Table 4, the comparison of the rates of positive responses to immunization with either KLH or TT showed that rates of positive secondary IgG responses were slightly greater than rates of positive primary IgG responses. For KLH, secondary responses were 100% positive, compared with primary responses of 79% on or before Day 10 and 87% on or before Day 21 (p < 0.001).

Table 4.  Response rates: primary vs. secondary challenge.

Peak response magnitude analysis

The peak day ranged from Day 2 to Day 42 across the different situations (combination of study, antigen, and antibody isotype), except for one study where no early samples were available and the IgG response to SRBC was measured at Day 150. The most common values of the peak day for either IgM or IgG were Day 7 (26% of situations), Day 14 (24% of situations), Day 10 (11% of situations) and Day 21 (9% of situations) after immunization.

Comparison of primary and secondary responses for mean peak response values

The difference in mean peak response values between primary and secondary challenges is shown in Figure 1. Seven studies were included in this analysis. Two studies had both IgM and IgG data included, and one study had both TT and KLH antigens, resulting in 10 possible comparisons. The difference between primary and secondary responses is shown separately for each antibody subtype and antigen as fold differences (a ratio of secondary response value to the primary value). As expected, based on immunoglobulin biology, mean IgG secondary:primary response fold differences were generally higher than 1, whereas mean IgM secondary:primary fold differences were approximately 1 or lower.

Figure 1.  Fold difference in mean peak response between primary and secondary challenges (X axis) for selected comparisons from individual studies (Y axis). Bars around the point estimate display the 95% confidence interval. Point estimates and 95% confidence intervals were calculated by a linear mixed model on the log_e scale and then back-transformed to the original scale. Red color denotes higher mean for secondary response and black higher mean for primary response. Comparison 1 = IgM, SRBC; comparison 2 = IgM, KLH; comparisons 3–5 = IgG, TT; comparison 6 = IgG, SRBC; comparisons 7–10 = IgG, KLH.

Effect of gender on mean peak response values

Figure 2 compares mean peak log-response between the genders within a given study and challenge. The difference between the two genders (male minus female) is calculated on the log_e scale and displayed as a fold-change on the original scale. The estimated difference is color-coded: red denotes a higher mean for males, black a higher mean for females and green equivalent means for males and females. This comparison of IgG and IgM response values from numerous studies that used both male and female NHP shows that the pooled difference between genders is small (0.02 on the log_e scale) and not statistically significant (p = 0.9 for a comparison to a zero difference between genders). The similarity between the two genders was also fairly consistent across studies (p = 1.0 for comparison of the variability in the magnitude of M-F difference across studies).

Figure 2.  Difference in mean peak response between male and female NHPs (X axis) for selected comparisons from individual studies (Y axis). Bars around the point estimate display the 95% confidence interval for the estimate. Point estimates and 95% confidence intervals were calculated by a linear mixed model on the log_e scale and then back-transformed to the original scale. Range of the X axis was limited to 0.01–100. The 95% confidence intervals for some of the smaller studies exceed this range. Red color denotes higher mean for males, black higher mean for females and green equivalent mean for males and females. Comparison 1 = secondary, IgM, KLH; comparisons 2 and 3 = secondary, IgG, TT; comparison 4 = secondary, IgG, SRBC; comparisons 5–11 = secondary, IgG, KLH; comparison 12 = primary, IgM, TT; comparisons 13 and 14 = primary, IgM, SRBC; comparisons 15–20 = primary, IgM, KLH; comparisons 21–23 = primary, IgG, TT; comparisons 24–27 = primary, IgG, SRBC; comparisons 28–36 = primary, IgG, KLH.

Effect of adjuvant on mean peak response values

In Table 5, the comparison of the mean peak log values between immunization with KLH and KLH with IFA in animals from the same study shows that for both primary and secondary IgG responses, co-administration of IFA with KLH leads to higher peak responses. Further assessment of the impact of adjuvants on the magnitude of peak responses was not possible given the heterogeneity of study designs.

Table 5.  Effect of adjuvant on mean log peak response: KLH vs. KLH with IFA.

Peak response inter-animal variability analysis

Comparison of primary and secondary peak responses for inter-animal variability

In Figure 3, the comparison of the between-animal SD at peak (on the log_e scale) between the primary and secondary responses shows that secondary responses were, on average, 16% more variable than primary responses (geometric mean of the ratio of secondary SD to primary SD = 1.16); however, the difference was not statistically significant (p = 0.4 for the comparison of ratios to 1; one-sample t-test on the log ratios for the comparisons in Figure 3).
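The summary statistic used here, a geometric mean of per-study SD ratios tested against 1 on the log scale, can be sketched as follows. The ratios in the example are hypothetical and are not the values plotted in the figure. The function `summarize_sd_ratios` is an illustrative helper that stops at the t statistic and its degrees of freedom rather than computing a p-value.

```python
import math
import statistics

def summarize_sd_ratios(ratios):
    """Summarize per-study SD ratios (e.g. secondary SD / primary SD).

    Returns the geometric mean of the ratios, the one-sample t statistic
    for testing mean(log ratio) = 0 (i.e. ratio = 1), and its degrees of
    freedom.
    """
    logs = [math.log(r) for r in ratios]
    mean_log = statistics.mean(logs)
    se = statistics.stdev(logs) / math.sqrt(len(logs))
    return math.exp(mean_log), mean_log / se, len(logs) - 1

# Hypothetical within-study ratios of secondary to primary SD.
gmean, t_stat, df = summarize_sd_ratios([0.8, 1.25, 1.5, 1.0, 2.0, 0.9])
print(round(gmean, 2), round(t_stat, 2), df)
```

Working on the log ratios treats a ratio of 2 and a ratio of 0.5 as equal and opposite departures from 1, which a test on the raw ratios would not.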

Figure 3.  Ratio of the between-animal SD at peak between the primary and secondary response (X axis) for comparisons from selected individual studies (Y axis). The SDs from primary and secondary responses have been calculated on the log_e scale. The bars represent the 95% confidence intervals around the estimates of the SD (secondary/primary) ratios, based on the F statistic. Vertical line at ratio = 1 denotes situation where inter-animal standard deviations for primary and secondary responses are identical. Red color denotes higher SD for secondary response, black higher SD for primary response and green equivalent SDs. Comparison 1 = IgM, KLH; comparisons 2–4 = IgG, TT; comparison 5 = IgG, SRBC; comparisons 6–9 = IgG, KLH.

Effect of gender on inter-animal variability at peak responses

In Figure 4, the comparison of the between-animal SD at peak between responses in males and females shows that the variability of responses was very similar for the two genders. On average, responses in males were 8% less variable than responses in females (geometric mean of male/female SD ratios = 0.92). However, the difference was not statistically significant (p = 0.6 for the comparison of all ratios to 1; one-sample t-test on log ratios).

Figure 4.  Ratio of the between-animal SD at peak between male and female NHPs (X axis) for comparisons from selected individual studies (Y axis). Ratios of the two SDs (male/female) were derived from the SDs calculated on the log_e scale. Bars represent the 95% confidence intervals around the estimates of the SD ratio, based on the F statistic. Vertical line at ratio = 1 denotes situation where inter-animal standard deviations for the two genders' responses are identical. Red color denotes higher mean for males, black higher mean for females and green equivalent for males and females. Comparisons 1 and 2 = secondary, IgM, KLH; comparisons 3 and 4 = secondary, IgG, TT; comparison 5 = secondary, IgG, SRBC; comparisons 6–12 = secondary, IgG, KLH; comparison 13 = primary, IgM, TT; comparisons 14 and 15 = primary, IgM, SRBC; comparisons 16–21 = primary, IgM, KLH; comparisons 22–26 = primary, IgG, TT; comparisons 27–30 = primary, IgG, SRBC; comparisons 31–41 = primary, IgG, KLH.

Effect of adjuvants on inter-animal variability at peak responses

The assessment of the effect of adjuvants was limited by the small number of animals (N = 3) that were challenged with an adjuvant and had a defined peak response. Each of these animals had both primary and secondary responses to KLH accompanied by IFA. Their responses were measured for IgG antibodies and expressed in center-point units. The inter-animal SD of the peak primary log-response for these three animals was 0.51. This variability was not significantly different from the variability of the primary response in two studies without an adjuvant (p = 0.9, Bartlett’s test) that also measured IgG by center-point titer. The standard deviations (on the loge scale) for these two non-adjuvant studies were 0.77 (N = 9) and 0.06 (N = 4), respectively, for a pooled SD of 0.64. The variability of the peak secondary response was also not significantly different between the three animals with the adjuvant and the 43 animals from six studies where KLH was used without an adjuvant (p = 0.6, Bartlett’s test). The inter-animal SD on the loge scale for the three animals with IFA was 0.52, whereas it ranged from 0.14 to 0.90 for the six studies without adjuvant (pooled SD 0.74).

Because of the small number of animals with IFA, it was difficult to make a precise assessment of the effect of the adjuvant. As there were no obvious differences between adjuvant and non-adjuvant standard deviations in this limited comparison, the data were pooled for the remaining analyses and no effort was made to control for adjuvant use in the antigen analysis or power calculations for peak responses.

Effect of type of challenge and NHP origin on inter-animal variability of the peak response

The estimated inter-animal SDs for different NHP with various challenges and the four selected measurement units are shown in Table 6. The two rightmost columns of the table show the numeric results for variability. The column “SD of log-response” shows the SD estimate using loge-transformed data and is useful in power and sample size analysis. The final column, “Relative variability on original scale eSD”, is useful in describing typical fold differences between animals in their response on the designated peak day. For example, the first entry in this column indicates that 1.32-fold differences (32% differences) between animals (on the original scale) would commonly be found. The range of these fold differences in Table 6 is 1.16-fold (16% differences) to 9.21-fold (about 9-fold differences between individual animals). These analyses included 2–28 animals and from one to three studies, as shown in the table. Among categories in Table 6 that had 10 or more animals (and one to three studies), the range is 1.32–4.95-fold differences between individual animals. The median value of variability is 2.1-fold for all analyses in Table 6. While these are typical fold differences between individual animals, it would not be uncommon to find smaller or larger differences, up to the square of the fold difference. For example, for the median value of 2.1-fold, it would not be uncommon to find 4.4-fold differences among individual animals in their response to antigen on the designated peak day.
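The relationship between the SD of the loge response and the fold differences quoted above can be illustrated with a short sketch. The helper names `fold_difference` and `upper_fold_difference` are hypothetical; the only inputs are the 2.1-fold median and the 1.32-fold entry reported in the text, from which the corresponding log-scale SDs are back-calculated.

```python
import math

def fold_difference(sd_log):
    """Typical fold difference between animals: e raised to the SD of the log_e response."""
    return math.exp(sd_log)

def upper_fold_difference(sd_log):
    """Larger difference that would 'not be uncommon': the square of the typical fold."""
    return math.exp(2 * sd_log)

# SD of log_e response implied by the reported 2.1-fold median variability.
sd_median = math.log(2.1)

print(round(sd_median, 2))                         # log-scale SD
print(round(fold_difference(sd_median), 2))        # typical fold difference
print(round(upper_fold_difference(sd_median), 2))  # square of the fold difference
```

Squaring the fold difference corresponds to two SDs on the log scale, which is why differences of that size are described as not uncommon rather than typical.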

Table 6.  Between-animal standard deviation of loge response on peak response day by type of challenge, units of measurement, primary (1°)/secondary (2°) challenge, single/multiple challenge, and origin.

All the responses measured in estimated titer units were very discrete (typically there were only 4–6 unique values per study) and are likely to be imprecise. The between-animal SDs at peak response calculated from these values are accordingly imprecise, reflecting both the true between-animal variability and measurement error. Six studies using estimated titers for the IgM measurements had estimated between-animal SDs of zero (n = 2–4 animals per study); based on this low variability and the discrete nature of the estimated titer scale, these studies were excluded from the presentation.

As can be deduced from Table 6, restricting the comparison of challenges and NHP origins to studies using the same measurement units resulted in a rather small number of numerical comparisons. These are shown in Table 7. In the largest comparison group, IgG responses measured by center-point titer, 5 studies with a total of 31 NHP challenged with KLH were compared with 3 studies with a total of 27 NHP challenged with TT. The pooled inter-animal SD at peak response was similar for the two groups (2.10-fold SD for KLH vs. 2.03-fold SD for TT; p = 0.7). Other comparisons of inter-animal variability in Table 7 showed only minor and non-significant differences, with the following exceptions. A modest and marginally significant difference in the variability of the IgG response to KLH vs. SRBC was observed when measured by estimated titer (p = 0.06). For IgG measured by concentration, the response to TT had significantly higher variability than the response to KLH (1.55-fold SD for KLH vs. 4.66-fold SD for TT; p < 0.001) and the response for Chinese NHP was significantly more variable than for Indonesian NHP (1.16-fold SD for Indonesian NHP vs. 4.66-fold SD for Chinese NHP; p < 0.001). These two results for the variability in peak IgG measured by concentration were inter-related: all TT NHP were of Chinese origin and came from the same two studies, and all KLH NHP were of Indonesian or unknown origin. Since the two factors (origin and type of challenge) were completely confounded, it was not possible to conduct a multivariate analysis to separate the effects of these two factors on the peak variability.

Table 7.  Comparisons of variability at peak response: Primary and secondary responses included.

Because of the disparity in the measurement units and the potential confounding among predictors, it was not possible to conduct a meaningful multivariate analysis to assess the joint effect of multiple factors on peak variability.

Power calculations

Power calculations were conducted to illustrate the kind of differences (expressed as fold differences of means) that are very likely to be detected with several different sample sizes. Two treatment groups are assumed to have equal sample sizes. The power calculation is derived from the between-animal variability observed for some cases reported in this study (Table 6). Between-animal variability is measured as the pooled standard deviation of loge response on the peak response day. Table 8 describes the sets of conditions and measurement units that were amenable to such an analysis and illustrates the fold differences that were very likely to be detected between the geometric means of two groups for a variety of sample sizes and one sample taken on a designated outcome sampling day. It can be noted that, for small sample sizes, very large treatment differences would have to be generated for all of the scenarios in Table 8 except one scenario reported for KLH, Indonesia, IgG, concentration. For n = 3 NHP per group, for example, the difference in treatment response would have to be about 10-fold or larger, except for the above-noted scenario.

Table 8.  Detectable fold differences* between two treatment groups. Assumptions: two-sided, independent samples t-test on loge data, one sample response value per animal, 80% power and p < 0.05 for statistical significance.
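The kind of calculation summarized in Table 8 can be approximated with the sketch below. It is not the authors' method: it swaps in a normal approximation for the t-test power calculation named in the table assumptions, so for small groups it gives smaller (more optimistic) detectable fold differences than the t-test would. The function name `detectable_fold` is hypothetical, and the SD used is an assumption: the log-scale SD implied by the 2.1-fold median variability reported earlier.

```python
import math
from statistics import NormalDist

def detectable_fold(sd_log, n_per_group, alpha=0.05, power=0.80):
    """Approximate smallest fold difference in geometric means detectable
    by a two-sided, two-group comparison on log_e data.

    Uses the normal approximation
        delta = (z_{1-alpha/2} + z_{power}) * SD * sqrt(2 / n)
    and converts the log-scale difference back to a fold difference.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    delta_log = (z_alpha + z_power) * sd_log * math.sqrt(2 / n_per_group)
    return math.exp(delta_log)

# Assumed SD: the median 2.1-fold between-animal variability from the text.
sd_log = math.log(2.1)

for n in (3, 4, 8, 12):
    print(n, round(detectable_fold(sd_log, n), 1))
```

Because the detectable difference scales with sqrt(2 / n) on the log scale, doubling the group size (for example by combining males and females) shrinks the detectable fold difference considerably, which is the point made in the discussion of group sizes.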

Discussion and conclusions

The cooperation of eight sponsoring companies enabled collection of data on 184 NHP from a diversity of studies and testing sites. Most of the comparisons were of limited scope because of the differences in study design (particularly the different schedules of sampling days and the diverse units of measurement), and the need for some analyses to divide the original dataset into studies that used the same measurement unit to report TDAR antibody responses. By converting the data into binary response variables in the positive primary/secondary response analysis, the limitations associated with diverse assay formats could be partially overcome. Use of the log-transformation of peak responses also facilitated pooling of studies for analyses of response magnitude. In addition, comparisons of the two genders and peak primary vs. secondary responses were generally more reliable, because it was possible to compare animals within the same study, eliminating the study effect and some of the problems from differences in measurement units. Sample sizes for many of the comparisons were rather small. The uncertainty in such comparisons was too large to warrant a definite conclusion. In some cases, direct comparisons between groups were not possible at all.

Overall, the analysis of the rates of positive primary and secondary responses showed that: (i) high rates of positive primary IgG responses are achieved within 21 days of immunization and high rates of positive secondary IgG responses are achieved within 10 days of immunization; (ii) the rates of positive secondary responses are generally greater than the rates of positive primary responses; and, (iii) the NHP origin and gender and the use of an adjuvant do not significantly impact the rate of positive responses. Peak responses were mostly observed within 21 days after immunization and were similar in male and female NHP. Mean peak secondary responses were greater than peak primary responses for IgG and the limited data set available showed that administering KLH with IFA was associated with increased magnitude of peak responses.

One critical aspect of the interpretation of TDAR results in NHP studies is the limitation or difficulty associated with the inter-animal variability in the magnitude of the antibody response. It was found that the value of variability in peak responses was 2.1-fold for a typical (median) combination of monkey and study characteristics. Moreover, even higher variability (up to the square of the fold difference; e.g., 4.4-fold differences) may be observed among individual animals in their response to antigen on the designated peak day. This study shows that inter-animal variability is similar in primary and secondary responses: although it is generally higher in secondary responses, this does not translate into statistically significant differences. Therefore, studying either primary or secondary responses is appropriate from the standpoint of animal-to-animal variability. While no statistically significant difference in animal-to-animal variability was observed with and without co-administration of IFA with KLH, it should be noted that this is based on a limited data set (n = 3). Moreover, KLH and TT were associated with comparable inter-animal variability, and while SRBC was associated with a slightly higher variability in comparison to KLH, this difference was not statistically significant. Importantly, gender had little effect on either the magnitude or the variability of these responses. The fact that male and female cynomolgus macaques showed similar responses is important, as it emphasizes that in standard toxicology studies, while male and female results both need to be evaluated, the statistical analysis of treatment-related effects on TDAR, if utilized, may be conducted by combining male and female results. This combination of male and female results would need to be conducted after careful comparison of the pharmacokinetics of the test article in both genders at each dose level tested and careful consideration of potential gender-specific effects on other toxicology or immunology parameters.

The examples from the power analysis presented in Table 8 illustrate that the animal group sizes in typical standard toxicology studies (e.g., n = 3 to 4 per sex per group, not counting animals dedicated to studying reversibility of toxicities) would usually require large treatment differences to yield statistically significant results, and that small numbers of animals may not be adequate to detect limited effects. It should be noted that inter-animal variability might not prevent the detection of profound effects that may be associated with certain immunosuppressive or immunodepleting agents. In any case, inter-animal variability needs to be critically considered when designing TDAR studies and selecting their sample sizes, so that the studies can meet their objectives.

In conclusion, while the TDAR is an important component of immunopharmacology and immunotoxicology studies, this study reveals the diversity of study designs, test systems, antigens, conditions of immunization and analytical methods utilized by various investigators. Despite the limitations due to the heterogeneity of the data set analyzed, it can be concluded that all antigens studied are comparable for rates and timing of positive responses, typically peaking within 21 days of immunization, and for inter-animal variability at peak responses. IFA did not significantly impact inter-animal variability. Importantly, male and female NHP responded similarly to immunization with T-cell dependent antigens and may be combined for statistical analysis purposes (if appropriate) as a way to increase the power to detect differences between treatment groups in safety studies of typical size.

Acknowledgments

We thank the Health and Environmental Sciences Institute Immunotoxicology Technical Committee members who contributed to assembling the data set utilized in this retrospective analysis. We thank Raegan O’Lone, Scientific Program Manager, ILSI Health and Environmental Sciences Institute, for her coordination support.

Declaration of interests

This publication stems from a subgroup of the Health and Environmental Sciences Institute Immunotoxicology Technical Committee, whose work is funded through ILSI-HESI.

References

  • Caldwell, R., Guirguis, M., and Kornbrust, E. 2007. Evaluation of various keyhole limpet hemocyanin dosing regimens for the cynomolgus monkey TDAR assay. The Toxicologist. Abstract #1723.
  • CPMP (Committee for Proprietary Medicinal products). 2000. Note for Guidance on Repeated Dose Toxicity. CPMP/SWP/1042/99.
  • Feldmann, M. 1996. Cell cooperation in the antibody response. In: Immunology Fourth Edition (Roitt, I. M., Brostoff, J., and Male, D. K., Eds.), New York: Mosby, pp. 8.1–8.16.
  • FDA (Food and Drug Administration, Center for Drug Evaluation and Research). 2002. Guidance for Industry. Immunotoxicology Evaluation of Investigational New Drugs. Washington, DC.
  • Haggerty, H. G. 2007. Immunotoxicity testing in non-rodent species. J. Immunotoxicol. 4:165–169.
  • ICH (International Conference on Harmonization). 2006. Guidance for Industry. S8 Immunotoxicity Studies for Human Pharmaceuticals. Washington, DC.
  • Kirk, S. A., Fraser, S. R., Gordon, D., and Templeton, A. 2008. Evaluation of the T-cell dependent antibody response (TDAR) to tetanus toxoid (TT), in the cynomolgus monkey, in the presence of a known immunosuppressant. The Toxicologist. Abstract #2134.
  • Luster, M. I., Portier, C., Pait, D. G., White, K. L. Jr, Gennings, C., Munson, A. E., and Rosenthal, G. J. 1992. Risk assessment in immunotoxicology. I. Sensitivity and predictability of immune tests. Fundam. Appl. Toxicol. 18:200–210.
  • Luster, M. I., Portier, C., Pait, D. G., Rosenthal, G. J., Germolec, D. R., Corsini, E., Blaylock, B. L., Pollock, P., Kouchi, Y., and Craig, W. 1993. Risk assessment in immunotoxicology. II. Relationships between immune and host resistance tests. Fundam. Appl. Toxicol. 21:71–82.
  • Picotti, J. R. 2008. T-Cell-dependent antibody response tests. In: Immunotoxicology Strategies for Pharmaceutical Safety Assessment (Danuta J. H., and Bussiere J., Eds.), New York: John Wiley and Sons Inc., pp. 67–76.
  • Piccotti, J. R., Alvey, J. D., Reindel, J. F., and Guzman, R. E. 2005. T-cell-dependent antibody response: Assay development in cynomolgus monkeys. J. Immunotoxicol. 2:191–196.
  • Tichenor, J. N., Dumont, C., Coletti, K. S., LeSauteur, L., Calise, D. V., Christn-Piche, M. S., and Satterwhite, C. 2010. A 6-week study to determine the antibody response to FK-506-immunosuppressed cynomolgus monkeys following the subcutaneous administration of keyhole limpet hemocyanin and tetanus toxoid in the presence or absence of incomplete Freud’s adjuvant. The Toxicologist. Abstract #1989.
