Reproductive milestones and oral contraceptive timing predict late-life mortality


Cox proportional hazards models

Age at first oral contraceptive use, together with the reproductive milestones of age at menarche, age at first sexual intercourse, age at first birth, age at menopause, and parity (Fig. 1 and Supplementary Table 1), were central predictors of mortality in UK women. These factors had effect sizes comparable to or larger than other major influences on survival such as education and income levels (Table 1) and retained their predictive value for mortality up to five decades after expression.

**Fig. 1: Reproductive milestones for women in the UK Biobank.**

Table 1 Preregistered and exploratory Cox proportional hazards models

This predictive accuracy was built on the long-term accurate recall of reproductive milestones. Individuals in the UK Biobank reported multiple responses to the same questions, separated by several years, allowing measurement of test-retest accuracy (Fig. 2a, c, d) and recall bias (Fig. 2b). Reproductive milestones were recalled accurately, with recall accuracy ranging from the lowest for age at menopause (over-60s at baseline only; 43% identical from wave 1–2; mean absolute error or MAE 1.5 years) to the highest for total parity (99.2% identical from wave 1–2; MAE 0.014; test restricted to over-50s at baseline). There was no substantial indication of recall biases within individuals, with only one significant, but in our opinion not functionally meaningful, relationship between follow-up time and reported age at first oral contraceptive use, which increased by 36 days per decade after baseline (Fig. 2b; r = 0.04; p = 0.002; N = 7941 repeat tests; Supplementary Code).

**Fig. 2: High accuracy and negligible recall bias in self-reported reproductive histories.**

Some 8116 certified deaths, classified under the International Classification of Disease ICD-10 codes, provided comprehensive age-specific mortality data across the follow-up period. Cox proportional hazards models were fit as preregistered to all deaths not classified as accidental or extrinsic (ICD-10 codes V00-Y99 inclusive; Supplementary Code), returning substantial and significant effects for outcome variables in all preregistered models.
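The z-scores reported for Cox models in this section are described as Wald statistics, that is, a fitted coefficient divided by its standard error. The sketch below shows where such a number comes from for the simplest possible case: one covariate, no stratification, no tied event times, fitted by Newton-Raphson on the partial likelihood. It is an assumption-laden toy, not the preregistered model; the function names and the eight-row data set are hypothetical.

```python
import math

def score_and_information(beta, times, events, x):
    """Score and observed information of the Cox partial likelihood (one covariate)."""
    score, info = 0.0, 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue  # censored rows contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        score += x_i - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info

def fit_cox_single(times, events, x, tol=1e-10, max_iter=50):
    """Return (beta, standard error, Wald z) via Newton-Raphson starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, times, events, x)
    se = 1.0 / math.sqrt(info)
    return beta, se, beta / se

# Hypothetical data: x = 1 tends to die earlier, so beta should be positive
times = [2, 4, 6, 9, 3, 7, 10, 12]
events = [1, 1, 1, 1, 1, 1, 1, 1]
x = [1, 1, 1, 1, 0, 0, 0, 0]
beta, se, z = fit_cox_single(times, events, x)
print(f"beta = {beta:.3f}, se = {se:.3f}, Wald z = {z:.2f}")
```

With only six informative events the z-score is small even though the coefficient is large, which echoes the point that power in these models is constrained by the number of events rather than the number of participants.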

Age at first birth was a significant predictor of post-reproductive mortality risk under the stratified Cox proportional hazards model (z-score −3.4; N = 182,034; p = 0.0007; Table 1 and Fig. 3a) and later ages at first birth predicted lower mortality risks, independent of income, education, the Townsend deprivation index, smoking, and alcohol consumption (Fig. 3a). This effect was independent of oral contraceptive patterns, with age at first birth retaining significance and a similar functional form when correcting for oral contraceptive use and timing (Supplementary Fig. 1). Age at first birth did not, however, retain significance (p > 0.05) when generating a comprehensive model to predict mortality risk for women with nonzero parity, fitting preregistered socioeconomic factors and age at first birth, age at menarche, total parity, age at first sexual intercourse, and age at first oral contraceptive use as fixed effects (N = 131,333; N = 3183 deaths; exploratory analysis; Supplementary Code; Supplementary Information). However, this non-significant result may have arisen from a loss of power when requiring nonzero parity, which halved sample size, and the expanded parameters of the model. Searching for effects in highly stratified models, even in such large samples, is constrained by the number of events²⁵ and often suffers a rapid loss of power as multiplicative effects are added²⁶.

**Fig. 3: Functional forms of relative mortality risk predicted using preregistered Cox proportional hazards models.**

Oral contraceptive timing was a major predictor of mortality risk under stratified Cox proportional hazards models (N = 209,403; z-score 3.7; p = 0.0002; Fig. 3b and Table 1). This predictive capacity persisted for decades after use. Models retained significance even after restricting analysis to individuals aged 60 and over at baseline (exploratory analysis; z-score = 2.24; p = 0.025; Supplementary Information), an average 39 years after starting and 30 years after ending oral contraception. After stratifying models for age at study enrollment (the age at baseline), oral contraceptive timing was the only variable that failed the z-test for proportional hazards at a local scale²⁷ (p = 0.002; Supplementary Information). This shift in model coefficients (standardised to a comparable z-score using the Wald statistic) may reflect changes in dose, safety, or type of oral contraceptives taken in earlier life^22,28,29 that could not be differentiated in the provided data or, non-exclusively, a nonlinear change in mortality risk that would be associated with a modification of ageing trade-offs.

Oral contraceptive timing retained predictive value for late-life mortality risk when included in a combined exploratory ‘all-in-one’ comprehensive model of reproductive milestones (z-score 3.2; p = 0.002) that added age at menarche, total parity, and age at first sex to the preregistered model structure (exploratory analysis; Supplementary Information; Table 1). Of these milestones, only age at first sex was not a significant predictor of mortality risk (Table 1). Associations with oral contraceptive timing did not, therefore, seem to act as a proxy for the overall rate of reproductive development: oral contraceptive timing retained predictive value in composite models and had negative model coefficients that contrast with the positive model coefficients of reproductive milestones (Supplementary Information).

In contrast, the age at last oral contraceptive use was not a significant predictor of mortality (p = 0.07; N = 188,160; Supplementary Information) despite capturing oral contraceptive use much closer to the study date. Likewise, the estimated maximum lifetime dose of oral contraceptives, approximated using the difference between first and last oral contraceptive use for women with completed fertility schedules, also failed to predict subsequent mortality risk (Cox proportional hazards model; N = 178,995 women aged over 50 at baseline; p = 0.61; Supplementary Information). These findings support the preregistered hypothesis that the timing of oral contraceptive use, rather than the estimated maximum lifetime dose or the most recent dose, is a driver of mortality risk differentials.

As preregistered, age at menarche and age at menopause were both tested as predictors of mortality using Cox proportional hazards models. Both models indicated that later onset of these reproductive milestones was linked to highly significant increases in mortality risk (z-scores of −5.0 and −5.7 respectively; p < 1 × 10⁻⁶; Table 1; Supplementary Information): each 1-year increase in the age at menarche was associated with an approximately 3.5% increase in all-cause mortality risk, while a single-year increase in age at menopause predicted a 1.4% increase in all-cause mortality. These are notable effect sizes in the context of public health: they are comparable, for example, to the increased mortality risk of seasonal influenza epidemics³⁰. As with oral contraceptive timing, however, these simple linear coefficients masked nonlinear residuals indicative of more complex interactions (Fig. 3c and Supplementary Figs. 1 and 2).
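Per-year percentages of this kind are the usual way of reading a Cox coefficient: a coefficient β on an age variable implies a hazard ratio of exp(β) per year, and the effect compounds multiplicatively over several years. The sketch below back-calculates the coefficients from the percentages quoted above purely for illustration; the fitted coefficients themselves are in Table 1 and the Supplementary Information, and the helper names are hypothetical.

```python
import math

def percent_change_per_unit(beta):
    """Percent change in hazard for a one-unit increase in the covariate."""
    return (math.exp(beta) - 1.0) * 100.0

def hazard_ratio_over(beta, units):
    """Hazard ratio implied by a linear coefficient across several units."""
    return math.exp(beta * units)

# Coefficients implied by the reported per-year effects (back-calculated, not fitted)
beta_menarche = math.log(1.035)
beta_menopause = math.log(1.014)

print(f"menarche: {percent_change_per_unit(beta_menarche):.1f}% per year")
print(f"menopause: {percent_change_per_unit(beta_menopause):.1f}% per year")
# Under a strictly linear term, a 3-year difference compounds to about 10.9%
print(f"3-year hazard ratio for menarche: {hazard_ratio_over(beta_menarche, 3):.3f}")
```

The compounding calculation only holds while the linear term is an adequate description, which the nonlinear residuals noted above call into question.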

Nonlinear functional forms were explored by removing target variables from each preregistered model and plotting mortality risk of these ‘socioeconomic-only’ mortality models against variation in target variables (Fig. 3d–f; exploratory analysis; Supplementary Code). Aligning with the ‘trade-off’ hypothesis, later ages at first birth were associated with lower relative mortality risk (Fig. 3a). More complex nonlinear interactions were evident in other data. The timing of oral contraception approximated the effect of age at first birth but had a more U-shaped effect on predicted mortality risk (Fig. 3b), while early or late ages at menarche were associated with elevations in predicted mortality risk (Fig. 3c): a strongly U-shaped pattern. A similar U-shaped pattern was observed for age at menopause (Supplementary Fig. 2) but this model warranted some caution given the substantially smaller sample sizes, incomplete reporting of menopause at baseline, and less reliable recall accuracy.

Age at menarche had the greatest predictive power for all models that included it as a covariate, independent of later reproductive milestones (Fig. 4d and Table 1; Supplementary Information), with greater predictive power for late-life mortality than age at menopause, age at first birth, oral contraceptive use, and even total parity, which all displayed independent predictive effects when included in the same model (Table 1; Supplementary Information).

**Fig. 4: Cross-sectional hazard rates observed during follow-up in matched cohorts.**

Propensity matched cohort models

The mortality risk differentials associated with oral contraception use, early births, and differential age at menarche were substantial. However, mortality risk is not a product of aging rate variation—measured by the mortality rate doubling time—alone. Overall mortality patterns are a composite of complex age-dependent and age-independent components—in other words, mortality rates indicate how bad things are, while aging rates and mortality rate doubling times indicate how fast they become worse. It was not clear, therefore, if mortality differentials observed under these models were a result of aging rate differentials, overall age-independent changes in mortality risk, or both.

To explore this possibility, mortality rate doubling times (indicative of actuarial aging rates^31,32) were measured using both a preregistered analysis of cross-sectional data (Fig. 4 and Supplementary Fig. 3), and an additional exploratory analysis to measure longitudinal on-study mortality accelerations during follow-up (Supplementary Fig. 4).
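As a reminder of what a mortality rate doubling time measures, the sketch below assumes a Gompertz-type hazard, h(age) = a·exp(b·age), so that log hazard is linear in age and the doubling time is ln(2)/b. The log-linear assumption, the function names, and the synthetic hazards are all illustrative; the preregistered estimation procedure is in the Supplementary Code.

```python
import math

def fit_log_linear_slope(ages, hazards):
    """Ordinary least-squares slope of log(hazard) on age."""
    logs = [math.log(h) for h in hazards]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    return sxy / sxx

def mortality_rate_doubling_time(slope):
    """Years needed for the hazard to double under a log-linear hazard."""
    return math.log(2) / slope

# Synthetic hazards constructed to double every 8 years (not study data)
ages = [50, 55, 60, 65, 70]
hazards = [0.002 * 2 ** ((age - 50) / 8) for age in ages]
doubling_time = mortality_rate_doubling_time(fit_log_linear_slope(ages, hazards))
print(f"MRDT: {doubling_time:.2f} years")
```

Two cohorts can share a slope, and therefore a doubling time, while differing in the intercept a. That is the distinction between aging rate and age-independent mortality drawn in the preceding paragraph.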

As preregistered, large propensity-matched cohorts were generated which differed only by variation in the target variable: age at first oral contraceptive use (above or below median; N = 50,604 per cohort; and lowest- versus last-three quartiles; N = 49,147 per cohort; Supplementary Table 2), ever- versus never-users of oral contraceptives (N = 50,181 per cohort), age at first birth (above or below median; N = 60,190 per cohort), or age at menarche (above or below median; N = 95,315 per cohort). Each paired cohort was matched exactly by age at baseline and matched by nearest-neighbour joining for education, income, the Townsend deprivation index, smoking rates, and alcohol intakes (logit distance with caliper of 0.05; Supplementary Information) ensuring identical age structure and socioeconomic factors for each paired cohort³³.
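The matching scheme described here (exact on age, nearest neighbour on the logit of the propensity score, within a caliper) can be sketched as a greedy one-to-one match without replacement. Several details are assumptions rather than the authors' procedure: the propensity scores are taken as given, the caliper of 0.05 is applied as an absolute distance on the logit scale (matching software often scales it in standard deviations instead), and matching is done in input order. All identifiers and data are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def match_with_caliper(treated, controls, caliper=0.05):
    """Greedy 1:1 matching without replacement.

    treated, controls: lists of (id, age_at_baseline, propensity_score).
    A control is eligible only if its age equals the treated unit's age and
    its logit propensity lies within `caliper` of the treated unit's.
    """
    available = {cid: (age, logit(ps)) for cid, age, ps in controls}
    pairs = []
    for tid, age, ps in treated:
        target = logit(ps)
        best_id, best_dist = None, caliper
        for cid, (c_age, c_logit) in available.items():
            dist = abs(c_logit - target)
            if c_age == age and dist <= best_dist:
                best_id, best_dist = cid, dist
        if best_id is not None:
            pairs.append((tid, best_id))
            del available[best_id]  # each control is used at most once
    return pairs

# Hypothetical units: (id, age at baseline, estimated propensity score)
treated = [("t1", 55, 0.50), ("t2", 60, 0.40), ("t3", 55, 0.70)]
controls = [("c1", 55, 0.505), ("c2", 60, 0.41), ("c3", 55, 0.30), ("c4", 61, 0.40)]
pairs = match_with_caliper(treated, controls)
print(pairs)
```

In this toy example the third treated unit goes unmatched because no same-age control falls inside the caliper, which is how caliper matching trades sample size for balance.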

Mortality rates in these matched cohorts accelerated at similar rates when assessed in cross-sectional data, with differences in mortality rate doubling times failing to reach significance (Fig. 4 and Supplementary Fig. 3; Supplementary Information) apart from one marginal exception (Fig. 4d) that disappeared when correcting for multiple testing. When hazard rates were calculated annually, individuals who took oral contraception underwent slower aging than the matched cohort when aging was measured by mortality rate doubling times (F-value 4.47; p = 0.04; Supplementary Fig. 3b and Supplementary Table 2). In addition to losing significance under the Benjamini–Hochberg correction for multiple testing (p > 0.05), this effect also lost significance (p = 0.09) when binning the same data into 5-year categories and re-fitting the model (Fig. 4b and Supplementary Table 2), highlighting the intended purpose of preregistering both annual and quinquennial bins. This outcome should be considered a non-result.
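The Benjamini–Hochberg adjustment mentioned here multiplies each p value by the number of tests divided by its rank, then enforces monotonicity from the largest p value downwards. The sketch below shows the arithmetic on a hypothetical family of five p values; the actual family of tests and adjusted values are in Supplementary Table 2.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, keeping a running minimum
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical family of five tests in which one raw p value is below 0.05
raw = [0.04, 0.30, 0.60, 0.45, 0.09]
adjusted = benjamini_hochberg(raw)
print([round(p, 4) for p in adjusted])
```

In this illustration the single raw p value of 0.04 adjusts to 0.2, so a lone marginal result among several unremarkable ones does not survive the correction.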

We made an exploratory attempt to circumvent these issues by comparing the mortality rate doubling times in matched cohorts observed during the extensive 11-year follow-up time in the UK Biobank to Jan 2020. These within-study mortality rate doubling times revealed complex outputs and were complicated by ascertainment biases in survival patterns (Supplementary Fig. 4). The observed acceleration of mortality rates with age was unexpectedly low and deviated strongly from a log-linear pattern over the first 2 years of observation (Supplementary Fig. 4a, c, e, g), suggesting substantial ‘healthy volunteer’ ascertainment biases. After removing this bias, by calculating mortality acceleration rates after 2 years on-study (Fig. 4a, c, e, g) or by measuring differences in residual mortality between cohorts (Fig. 4b, d, f, h), the timing of oral contraception (trimmed data p = 0.02; residual regression p = 0.003; Supplementary Fig. 4a, b) and the use of oral contraception (trimmed data p > 0.05; regression of residuals p = 0.03; Supplementary Fig. 4c, d) gained marginal significance. These results contrasted with cross-sectional data, which indicated the opposite trends in mortality rate doubling times (Supplementary Fig. 3a). Only one of the p values survived correction for multiple testing (Supplementary Fig. 4b), and the results were again considered non-informative.

As such, it remains unclear if significant mortality risk differentials, observed for women who reached reproductive thresholds or used oral contraceptives at different ages, are associated with differences in the underlying rate of aging. Observed changes in mortality between cohorts may be due to differential aging, constant advantages in frailty or age-independent mortality rates, or some combination of both effects.

Cohort-based data generally supported the mortality differentials uncovered in greater detail by the Cox proportional hazards models. However, these cohort comparisons were limited in their ability to untangle mortality patterns, as they were restricted to binary comparisons of two cohorts and therefore could not capture nonlinear trends as observed in Fig. 4. In paired cohorts with identical age distributions, incomes, education, Townsend deprivation indices, smoking rates, and alcohol intake, a simple nonparametric two-sample proportion test³⁴ (exploratory analysis; Supplementary Code) indicated significantly lower mortality risk in oral contraceptive users (X-squared 5.58; p = 0.02), individuals with an above-median age at first birth (median 25 years; X-squared 9.68; p = 0.002) or age at menarche (median 14 years; X-squared 9.37; p = 0.002). No significant difference in mortality for individuals with above/below median ages at first oral contraceptive use was apparent (median 21 years; X-squared 2.95; p = 0.09; Supplementary Table 2), likely as a combined result of the requirement of a binary comparison between cohorts and the partially U-shaped distribution of mortality risk associated with oral contraceptive timing (Fig. 3b).

Improvements in all-cause mortality overwhelmed any potential mortality costs associated with the established clinical side effects of oral contraception. In our large matched cohorts, oral contraception use was not associated with a significant elevation in mortality risk from thrombosis and pulmonary embolisms²⁸ (18 deaths in users, 16 in non-users), breast cancer^28,35,36 (194 deaths in users, 180 in non-users), cervical cancer^28,35 (<10 deaths per cohort), or cardiovascular disease (ICD-10 codes I20-I25; 111 deaths in users, 139 in non-users; exploratory analysis; N = 100,362; p > 0.05 in all comparisons; Supplementary Code): diseases with prior evidence for elevated mortality risks during oral contraceptive use. These outcomes should be treated as an absence of evidence, not a strong non-result, for three reasons: we had low power to detect mortality differentials in matched cohorts, comparisons of matched cohorts are restricted to simple pairwise comparisons, and the observation period was usually long after active use of contraceptives. Our results therefore support previous findings that the costs of oral contraceptive use, while forming a clear clinical risk, are more than offset by observed benefits^22,28,29.
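The X-squared statistics quoted for the matched cohorts have one degree of freedom, so their p values follow directly from the complementary error function. The sketch below checks that conversion against the reported values and then, as an illustration only, applies a continuity-corrected two-sample proportion test to the cardiovascular death counts given above (111 versus 139 deaths in cohorts of 50,181). The text does not state which test produced the cause-specific p values, so the Yates correction and the function names here are assumptions.

```python
import math

def chi2_p_value_1df(statistic):
    """Upper-tail p value of a chi-squared statistic with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

def two_sample_proportion_test(events_a, n_a, events_b, n_b):
    """Chi-squared test of equal proportions with Yates continuity correction."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    n = n_a + n_b
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    statistic = n * corrected ** 2 / (n_a * n_b * (a + c) * (b + d))
    return statistic, chi2_p_value_1df(statistic)

# Reported X-squared values: users vs non-users, first birth, menarche, OC timing
reported = [5.58, 9.68, 9.37, 2.95]
p_values = [chi2_p_value_1df(s) for s in reported]
print([round(p, 3) for p in p_values])

# Cardiovascular deaths in users vs non-users, 50,181 women per matched cohort
cvd_stat, cvd_p = two_sample_proportion_test(111, 50181, 139, 50181)
print(f"X-squared = {cvd_stat:.2f}, p = {cvd_p:.3f}")
```

Under these assumptions the recomputed p values round to the reported 0.02, 0.002, 0.002, and 0.09, and the cardiovascular comparison stays above 0.05 despite 28 fewer deaths among users, consistent with the low power noted above.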

Again, it was unclear if the observed, highly significant mortality differences between cohorts were driven by fixed age-independent changes in mortality risk across all ages, changes in mortality rate doubling times (indicating differential aging), or both. The most we could say was that there was no evidence to support differential aging rates between matched cohorts, although power to detect such effects was low.

Linear models for proxy measures of physiological aging

This problem was not clarified by our preregistered analysis of physical indicators of aging: forced expiratory volume in one second (FEV1), average left + right hand grip strength (HGS), and self-reported health (SRH). As with the Cox models and matched cohort analyses, preregistered linear models returned overwhelmingly significant results of large effect size. Under our preregistered models (Supplementary Code), oral contraceptive use and timing, and age at first birth were highly significant predictors of FEV1, HGS, and SRH. However, this linear modelling section of the preregistered study was removed. Surprisingly, our preregistered indicators of physiological aging seemed to have no clear value for informing aging rates in the UK Biobank.

Exogenous indicators of aging should, obviously, share aging as a latent cause, and therefore display some covariance within individuals. Within individuals measured within a single wave, these measures were repeatable and concordant: left and right hand-grip strength were highly correlated (r = 0.80; N = 270,257; p < 2.2 × 10⁻¹⁶), despite variance caused by hand-dominance, as were consecutive tests of FEV1 (r = 0.93; N = 247,857; p < 2.2 × 10⁻¹⁶). When tested across waves, however, none of our putative indicators of aging displayed any meaningful covariance structure (Supplementary Fig. 5a–d; exploratory analysis; Supplementary Code). The rate at which our selected aging indicators decayed with age was either orthogonal or scarcely correlated across measures (Supplementary Fig. 5). While FEV1 and HGS both decayed with age, the rate at which they decayed with age was at best very weakly correlated within individuals (r = 0.1; N = 26,226; Supplementary Fig. 5d). That is, despite these metrics being correlated with each other cross-sectionally (r = 0.36; N = 246,377; p < 2.2 × 10⁻¹⁶), and despite substantial sample sizes and long follow-up periods, the rate at which aerobic capacity and physical strength declined with age was not meaningfully related. One of the selected indicators, SRH, did not even decay longitudinally with age (Supplementary Fig. 5c) and, unsurprisingly, rates of change in SRH were therefore not meaningfully correlated with longitudinal changes in HGS (r = 0.07; N = 30,282; p < 2.2 × 10⁻¹⁶) or FEV1 (r = 0.05; N = 26,327; p = 3 × 10⁻¹⁴). In the few hundred individuals attending all four waves, the rate of decline in HGS and FEV1 measured between waves 1 and 2 was not even correlated with later rates of physical decline, measured between waves 3 and 4, in the same person (p > 0.05; N = 579 and 453 respectively; see Glindmeyer et al. for a related finding³⁷; Supplementary Code).
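The comparison of decline rates described here amounts to computing a per-person annual change for each indicator between two waves and then correlating those changes across people. The sketch below shows that calculation with a plain Pearson correlation. The helper names and the five-person data set are hypothetical, and the actual analysis (Supplementary Code) may differ in how rates were derived.

```python
import math

def annual_change(first, second, years):
    """Per-person annual change between two waves."""
    return [(b - a) / y for a, b, y in zip(first, second, years)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements for five people at two waves (not study data)
years_between = [5.0, 4.0, 6.0, 5.0, 4.5]
fev1_w1, fev1_w2 = [3.0, 2.8, 3.4, 2.5, 3.1], [2.5, 2.6, 3.0, 2.4, 2.6]
hgs_w1, hgs_w2 = [30.0, 28.0, 35.0, 26.0, 31.0], [29.0, 24.0, 33.0, 22.0, 30.5]

fev1_rate = annual_change(fev1_w1, fev1_w2, years_between)
hgs_rate = annual_change(hgs_w1, hgs_w2, years_between)
print(f"r between decline rates: {pearson_r(fev1_rate, hgs_rate):.2f}")
```

Correlating rates of change rather than levels is what separates this analysis from the cross-sectional correlation of 0.36 reported above: two measures can track each other across people at one time point while declining at unrelated speeds within people.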

It was therefore difficult to understand why these physical indicators should be near-orthogonal or uncorrelated within the same individual, if they shared aging as a latent cause of any discernible value. This unexpected result raised fundamental questions on the capacity of these variables to capture latent variation in aging rates in the UK Biobank data, and the preregistered analysis of these indicators was dropped.
