Life expectancy estimates are affected by missing data and methodological assumptions: Why we should not rely on experimental estimates
There is a wealth of evidence showing that people from most minoritised ethnic groups have much poorer health than the White British group. These studies have shown that not only do ethnic inequalities in health exist, but that they have persisted over time, and are exacerbated in later life. While some studies have found a disconnect between health and mortality outcomes, it is theoretically plausible, and is indeed evidenced in international studies, that ethnic inequalities in poor health are followed by ethnic inequalities in mortality.
It is therefore surprising that life expectancy estimates from the Office for National Statistics (ONS) show that minoritised ethnic groups in England and Wales have exceedingly high life expectancies, especially at the oldest ages, despite having some of the poorest health outcomes throughout their lifetime. For example, the 2011-2014 life expectancy estimates suggested that a Bangladeshi woman aged 80-84 would live a further 15.5 years (compared with an additional 9.9 years for a White woman of the same age), a Bangladeshi woman aged 85-89 a further 13.5 years (compared with 7.2 years), and a Bangladeshi woman over the age of 90 a further 11 years (compared with 5.3 years). These findings are counter-intuitive, since research using the 2011 Census found that Bangladeshi women had some of the highest rates of poor health across all age groups, equivalent to those of White women who were 30 years older. Based on well-established associations between morbidity and mortality, one would expect Bangladeshi women (who live in worse health than White women) to experience shorter, not longer, life expectancy.
This presents a complicated picture. Do people with the poorest health really live the longest? Should we accept that poor health in life and risk of mortality are not related? Or is something more complex happening? In a previous blog we discussed our concerns with these experimental statistics. In this blog we take a closer look at the approach used by the ONS in producing their estimates of life expectancy by ethnicity, examining the data sources and methodology used.
We discovered several ways in which data shortcomings could lead to substantial errors in life expectancy estimates for minoritised ethnic groups. One of the biggest problems was missing data. The ONS’ life expectancy calculations were based on data from the 2011 Census, matched to death registrations in the period 2011-2014. However, in our work, we observed that a large proportion of people go missing from records in the ten years between Censuses. Mapped data was not available for the 2021 Census, but in the period 2001-2011, the proportion of missing people was substantially greater for all minoritised ethnic groups than for the White British group. Figure 1 outlines this crucial problem. While as many as 9% of White British people went missing between Censuses, the proportion was higher still for minoritised ethnic groups, ranging from 12% in the Indian group to 30% in the Other ethnic group, and this discrepancy in missingness was greater for older people.
This matters because deriving life expectancy estimates requires two important pieces of information: the population size and the number of deaths within that population. If people are present in one Census but missing in the next, and do not have a known outcome to explain their missingness (i.e. whether they have died or migrated), then they will continue to be counted as alive and become “statistically immortal”. Those who are missing have almost certainly migrated out of the country, and since we can’t count whether and when they have died, this will inevitably under-estimate the mortality rate and lead to artificially high life expectancy estimates, which we may be seeing in the ONS experimental life expectancy estimates. In fact, in international research that addresses the problem of statistical immortality by reliably tracking outcomes for people who have left the country, the mortality advantage of migrants disappears. Due to the cumulative nature of life expectancy calculations over age groups, the problem of missing data is exacerbated in the oldest ages.
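To make the arithmetic of statistical immortality concrete, here is a minimal sketch in Python. All of the numbers are hypothetical and chosen only for illustration; they are not ONS figures. The sketch also makes two simplifying assumptions: emigrants have the same death rate as everyone else, and the death rate is constant, so that expected remaining years are roughly one divided by the rate.

```python
# Hypothetical numbers for one older age group; none of these are ONS figures.
counted_population = 1000   # people counted at the Census
true_death_rate = 0.10      # assumed true annual death rate
emigrated = 200             # assumed unrecorded emigrants, still counted as resident

# Only deaths that happen in the country reach the death register
# (assuming, for simplicity, that emigrants die at the same rate as stayers).
recorded_deaths = (counted_population - emigrated) * true_death_rate

# The recorded deaths are still divided by everyone who was counted.
observed_death_rate = recorded_deaths / counted_population

# With a constant death rate, expected remaining years are roughly 1 / rate.
true_remaining_years = 1 / true_death_rate
observed_remaining_years = 1 / observed_death_rate

print(f"observed death rate: {observed_death_rate:.3f}")
print(f"remaining years, true vs observed: "
      f"{true_remaining_years:.1f} vs {observed_remaining_years:.1f}")
```

In this toy case the deaths of the 200 emigrants are never recorded, so the observed death rate falls from 0.10 to 0.08 and the implied remaining life expectancy rises from 10 to 12.5 years, even though nobody is actually living any longer.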
Figure 1: Missing status (between 2001 and 2011 censuses) by ethnicity (percentages)
Source: Office for National Statistics Longitudinal Study (ONS-LS) members observed in 2001 and followed up in 2011. The ONS-LS contains linked census and life events data for a 1% sample of the population in England and Wales since 1971.
Methodological considerations when calculating life expectancy estimates
There are two other important issues to consider in calculating life expectancy estimates for minoritised ethnic groups. The first is the “salmon bias”, a phenomenon whereby migrants who are unwell return to their country of origin at the end of their lives. Studies have highlighted that missing data between censuses is not random and that the mortality rate of people who return to their country of birth is higher than that of those who stay. Indeed, the previously mentioned research using data from France found that the risk of mortality for a foreign-born person who migrates out of France is between 1.6 and 3.5 times higher than that of a foreign-born person who stays in France.
The second issue to consider when calculating life expectancies is that the available data in the UK do not permit us to observe what happens to an individual after they have migrated outside the UK, in terms of mortality. To get around this, we simulated the effects of these factors on life expectancy estimates by replicating the ONS’ life expectancy model, and adjusting its assumptions to reflect the higher amounts of missingness among minoritised ethnic communities, and the increased mortality risk due to salmon bias.
Our main aim was to test the sensitivity of life expectancy estimates by ethnicity to these two issues. We tested several scenarios around salmon bias: first, one in which the mortality rate of people migrating outside the UK was unchanged, and then scenarios in which the mortality rate of that cohort was increased by multipliers similar to those seen in existing research.
The purpose of this research is not to produce definitive estimates of life expectancy (which is not possible to do with the available data), but rather to show how much variability there is in the ONS’ life expectancy estimates when we give full consideration to the unknown factors in the available data. Doing so illuminates the serious problem of applying standard methodologies, such as that used by the ONS, to estimate ethnic differences in life expectancy in England and Wales (and in other countries in the Global North).
Findings and Discussion
We illustrate the variability in life expectancy in Figure 2, using the example of Pakistani women, a group who typically have poorer health than White women. The red point on the left of the set of estimates represents the life expectancy estimate we derived when using the ONS’ methodology. The subsequent points represent the models where we have changed assumptions around the amount of migration outside the UK, and the mortality rate of those migrating. When we vary these assumptions, the estimates for the Pakistani cohort vary much more than those for the White cohort. For Pakistani women, the estimate of life expectancy falls by several years across our model set, whereas the change is a little over half a year for White women. For more information on the methods and the results for other ethnic groups, see Figure 3 and the methods box below.
Figure 2: Scenario testing of LE estimates at birth by ethnic group (women)
We expect that the high variability in our life expectancy estimates is due to the much greater likelihood of minoritised ethnic people disappearing from records and becoming statistically immortal. We accept there are limitations to our analyses, largely due to the incompleteness and inaccessibility of data. Indeed, producing accurate estimates of life expectancy would require visibility of the life outcomes of those who have left the country, for example through pensions data. Unfortunately, these data are not currently available, and in their absence we consider any life expectancy estimates derived using the standard methodology to be flawed.
A major concern driving our work is how the ONS’ experimental life expectancy statistics are being used as fact by policy and academic stakeholders, despite the ONS labelling them as experimental. These statistics matter, because they directly inform public policy and health service provision, which then impact people’s lives. Given the high sensitivity to different methodological and theoretical assumptions, we urge that these experimental estimates are treated with caution. Further, we recommend that due attention is given to the very real and substantial ethnic inequalities in health observed across the entire life course.
For a more in-depth discussion of the work presented here, please read our report Ethnic Inequalities in Mortality in England And Wales: Examining Life Expectancy Data and Methods, which is available here.
Figure 3: Variability in LE estimates at birth by ethnic group
Methods
To carry out our sensitivity testing, we first had to replicate the ONS’ original estimates. We could then alter some of the parameters used by the ONS in their model to reflect our alternative assumptions around external migration (or outmigration) and the mortality rate of outmigrants. We worked with the ONS researchers who created the original life expectancy estimates, who shared the code they used to produce the estimates, enabling us to understand how their life expectancy estimates had been produced. As researchers external to the ONS, we could only access a reduced version of the dataset that the ONS used, but nonetheless we were able to match the ONS’ life expectancy estimates reasonably closely.
The model parameters we changed were the amount of outmigration from the UK, and the mortality rate of outmigrants. We estimated outmigration using the proportion of people who were lost to follow-up between the 2001 and 2011 Censuses in the ONS-Longitudinal Study, assuming that everyone missing between Censuses had in fact outmigrated (the ONS state that the majority of loss to follow-up can be explained by unrecorded emigration). We calculated these estimates by age group, ethnic group and sex, and applied these estimates of missingness to our model. We applied a scaling factor to the mortality rate of this cohort of outmigrants to simulate the salmon bias. We based the scaling factors used upon the work of Guillot et al. in their study using French pension data.
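As a rough illustration of this kind of scenario testing, the Python sketch below applies an outmigration share and a salmon-bias multiplier to a small, simplified life table. It is not the ONS model or the code behind the estimates reported here. The function names, the three age bands, the death rates and the missingness shares are all hypothetical, and the blending formula in adjust_rates is one simple assumption about how unrecorded outmigrants might be folded back in. Only the multipliers of 1.6 and 3.5 echo the range reported in the French research mentioned above.

```python
def life_expectancy(rates, width=5.0):
    """Life expectancy at the start of the first age band.

    Uses a simple abridged life table: deaths are assumed to fall, on
    average, halfway through each band, and the last band is open-ended.
    """
    alive = 1.0        # proportion of the cohort still alive
    years_lived = 0.0
    for m in rates[:-1]:
        q = width * m / (1.0 + 0.5 * width * m)   # probability of dying in the band
        deaths = alive * q
        years_lived += width * (alive - 0.5 * deaths)
        alive -= deaths
    # Open-ended final band: survivors live 1 / rate further years on average.
    years_lived += alive / rates[-1]
    return years_lived


def adjust_rates(observed, missing_share, multiplier):
    """Blend stayers and assumed outmigrants (an illustrative assumption)."""
    adjusted = []
    for m_obs, share in zip(observed, missing_share):
        m_stay = m_obs / (1.0 - share)    # recorded deaths spread over stayers only
        m_out = multiplier * m_stay       # salmon-bias scaling for outmigrants
        adjusted.append((1.0 - share) * m_stay + share * m_out)
    return adjusted


# Hypothetical inputs for three age bands (65-69, 70-74, 75+); not ONS figures.
observed = [0.015, 0.030, 0.090]       # recorded annual death rates
missing_share = [0.10, 0.15, 0.20]     # assumed share lost to follow-up

baseline = life_expectancy(observed)
scenarios = {
    k: life_expectancy(adjust_rates(observed, missing_share, k))
    for k in (1.0, 1.6, 3.5)           # 1.0 = outmigrant mortality unchanged
}

print(f"unadjusted: {baseline:.1f} years at age 65")
for k, le in scenarios.items():
    print(f"multiplier {k}: {le:.1f} years at age 65")
```

Under these assumptions, each step up in the multiplier raises the blended death rates and so lowers the estimated life expectancy, and the size of the drop depends on how large the assumed share of missing people is.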
Acknowledgements
The permission of the Office for National Statistics to use the Longitudinal Study is gratefully acknowledged, as is the help provided by staff of the Centre for Longitudinal Study Information & User Support (CeLSIUS). CeLSIUS is funded by the ESRC under project ES/V003488/1. The authors alone are responsible for the analysis and interpretation of the data.
This work contains statistical data from ONS which is Crown Copyright. The use of the ONS statistical data in this work does not imply the endorsement of the ONS in relation to the interpretation or analysis of the statistical data. This work uses research datasets which may not exactly reproduce National Statistics aggregates.
This project has been funded by the Nuffield Foundation, but the views expressed are those of the authors and not necessarily the Foundation. Visit www.nuffieldfoundation.org