Skip to contents

Public Health England (PHE) has an excel file that will calculate life expectancy using an abridged life table. The make_life_table() function in this package was written to exactly replicate the formulas and methods used in that excel file. The PHE excel file can be found here.

The make_life_table() function is designed to work with grouped data so that you can calculate life tables for multiple groups at once. It also allows for the use of ACS age groups in addition to the age groups included in the PHE excel calculator.

  • PHE age groups: 0, 1-4, 5-9, 10-14 … 80-84, 85-89, 90+
  • ACS age groups: 0-4, 5-9, 10-14 … 80-84, 85+

make_life_table()

Basic usage

The output from the make_life_table() function has every column that is included in the PHE excel calculator. This example includes the same death and population numbers that are in the PHE excel calculator:

data1 <- data.frame(
  age_cat = c("0", "1-4", "5-9", "10-14", "15-19","20-24", "25-29", "30-34", "35-39", "40-44",
  "45-49", "50-54", "55-59", "60-64", "65-69", "70-74", "75-79", "80-84", "85-89", "90+"),
  deaths = c(206, 37, 23, 23, 105, 162, 268, 314, 413, 584, 954,
            1359, 1912, 2824, 4507, 5851, 7117, 8192, 7745, 6442),
  population = c(50698, 215400, 280458, 258105, 282062, 329060, 306097, 274544, 260415,
           267450, 311314, 324311, 296825, 271339, 284608, 228062, 162785, 111263, 58987, 26016))

life_table1 <- make_life_table(data1)
str(life_table1)
#> 'data.frame':    20 obs. of  20 variables:
#>  $ age_cat            : chr  "0" "1-4" "5-9" "10-14" ...
#>  $ deaths             : num  206 37 23 23 105 162 268 314 413 584 ...
#>  $ population         : num  50698 215400 280458 258105 282062 ...
#>  $ start_age          : num  0 1 5 10 15 20 25 30 35 40 ...
#>  $ int_width          : num  1 4 5 5 5 5 5 5 5 5 ...
#>  $ fract_surv         : num  0.1 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
#>  $ death_rate         : num  4.06e-03 1.72e-04 8.20e-05 8.91e-05 3.72e-04 ...
#>  $ prob_dying         : num  0.004048 0.000687 0.00041 0.000445 0.00186 ...
#>  $ prob_surv          : num  0.996 0.999 1 1 0.998 ...
#>  $ num_alive_int      : num  100000 99595 99527 99486 99442 ...
#>  $ num_dying_int      : num  404.8 68.4 40.8 44.3 184.9 ...
#>  $ pers_yrs_lived_int : num  99636 398244 497532 497319 496746 ...
#>  $ pers_yrs_lived_past: num  7832451 7732815 7334572 6837040 6339721 ...
#>  $ obs_le_int         : num  78.3 77.6 73.7 68.7 63.8 ...
#>  $ sample_var         : num  7.92e-08 1.27e-08 7.30e-09 8.62e-09 3.29e-08 ...
#>  $ weighted_var       : num  4888352 724166 367032 374649 1224149 ...
#>  $ sample_var_pers_yrs: num  39355415 34467062 33742896 33375864 33001214 ...
#>  $ sample_var_obs_le  : num  0.00394 0.00347 0.00341 0.00337 0.00334 ...
#>  $ ci_low_95          : num  78.2 77.5 73.6 68.6 63.6 58.8 53.9 49.1 44.4 39.7 ...
#>  $ ci_high_95         : num  78.4 77.8 73.8 68.8 63.9 59 54.1 49.3 44.6 39.9 ...

Additional options

Multiple groups

One benefit to using the make_life_table() function instead of the excel file is that you can create full life tables for multiple groups at once. You just specify the grouping variables with the grouping_vars argument.

Here we have a version of data1 but with an additional set of data that were simulated based on the data entered into the PHE excel file.

data2 <- data.frame(
  group = c(rep("phe", 20), rep("simulated", 20)),
  age_cat = rep(c("0", "1-4", "5-9", "10-14", "15-19","20-24", "25-29", "30-34", "35-39", "40-44",
                 "45-49", "50-54", "55-59", "60-64", "65-69", "70-74", "75-79", "80-84", "85-89", "90+"), 2),
  deaths = c(206, 37, 23, 23, 105, 162, 268, 314, 413, 584, 954,
               1359, 1912, 2824, 4507, 5851, 7117, 8192, 7745, 6442,
               232, 30, 41, 22, 194, 168, 315, 313, 406, 643, 963,
               1446, 1979, 2814, 4587, 5874, 7111, 8221, 7825, 6540),
  population = c(50698, 215400, 280458, 258105, 282062, 329060, 306097, 274544, 260415,
                       267450, 311314, 324311, 296825, 271339, 284608, 228062, 162785, 111263, 58987, 26016,
                       51578, 215512, 279462, 257256, 282348, 329111, 306514, 274397, 259847, 267045,
                       311791, 323739, 297453, 271344, 285047, 227655, 162922, 110554, 58886, 26243))

Then you can create life tables for each group by specifying that grouping_vars = "group". The grouping_vars argument can take a list of variables if you want to group by year and geography (for example).

life_table2 <- make_life_table(data2,
                               grouping_vars = "group")
str(life_table2)
#> 'data.frame':    40 obs. of  21 variables:
#>  $ group              : chr  "phe" "phe" "phe" "phe" ...
#>  $ age_cat            : chr  "0" "1-4" "5-9" "10-14" ...
#>  $ deaths             : num  206 37 23 23 105 162 268 314 413 584 ...
#>  $ population         : num  50698 215400 280458 258105 282062 ...
#>  $ start_age          : num  0 1 5 10 15 20 25 30 35 40 ...
#>  $ int_width          : num  1 4 5 5 5 5 5 5 5 5 ...
#>  $ fract_surv         : num  0.1 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
#>  $ death_rate         : num  4.06e-03 1.72e-04 8.20e-05 8.91e-05 3.72e-04 ...
#>  $ prob_dying         : num  0.004048 0.000687 0.00041 0.000445 0.00186 ...
#>  $ prob_surv          : num  0.996 0.999 1 1 0.998 ...
#>  $ num_alive_int      : num  100000 99595 99527 99486 99442 ...
#>  $ num_dying_int      : num  404.8 68.4 40.8 44.3 184.9 ...
#>  $ pers_yrs_lived_int : num  99636 398244 497532 497319 496746 ...
#>  $ pers_yrs_lived_past: num  7832451 7732815 7334572 6837040 6339721 ...
#>  $ obs_le_int         : num  78.3 77.6 73.7 68.7 63.8 ...
#>  $ sample_var         : num  7.92e-08 1.27e-08 7.30e-09 8.62e-09 3.29e-08 ...
#>  $ weighted_var       : num  4888352 724166 367032 374649 1224149 ...
#>  $ sample_var_pers_yrs: num  39355415 34467062 33742896 33375864 33001214 ...
#>  $ sample_var_obs_le  : num  0.00394 0.00347 0.00341 0.00337 0.00334 ...
#>  $ ci_low_95          : num  78.2 77.5 73.6 68.6 63.6 58.8 53.9 49.1 44.4 39.7 ...
#>  $ ci_high_95         : num  78.4 77.8 73.8 68.8 63.9 59 54.1 49.3 44.6 39.9 ...

Renaming columns

The function expects a data.frame with a minimum of three columns:

  • age categories (age_cat): this should be a character variable with the start age and end age separated by a -. Ex: c("0", "1-4", "5-9", "10-14", ..."90+").
  • death counts (deaths): this should be a numeric variable with the number of deaths that occurred in each age group for each interval
  • population years (population): this should be a numeric variable with the number of population years in each group for each interval

If your variables aren’t named "age_cat", "deaths" and "population", you can pass your variable names to the function using the age_cat_var, deaths_var and population_var arguments. For example:

data2 <- data.frame(
  ages = c("0", "1-4", "5-9", "10-14", "15-19","20-24", "25-29", "30-34", "35-39", "40-44",
  "45-49", "50-54", "55-59", "60-64", "65-69", "70-74", "75-79", "80-84", "85-89", "90+"),
  death_count = c(206, 37, 23, 23, 105, 162, 268, 314, 413, 584, 954,
            1359, 1912, 2824, 4507, 5851, 7117, 8192, 7745, 6442),
  population_years = c(50698, 215400, 280458, 258105, 282062, 329060, 306097, 274544, 260415,
           267450, 311314, 324311, 296825, 271339, 284608, 228062, 162785, 111263, 58987, 26016))

life_table1 <- make_life_table(data2,
                               age_cat_var = "ages",
                               deaths_var = "death_count",
                               population_var = "population_years")
str(life_table1)
#> 'data.frame':    20 obs. of  20 variables:
#>  $ age_cat            : chr  "0" "1-4" "5-9" "10-14" ...
#>  $ deaths             : num  206 37 23 23 105 162 268 314 413 584 ...
#>  $ population         : num  50698 215400 280458 258105 282062 ...
#>  $ start_age          : num  0 1 5 10 15 20 25 30 35 40 ...
#>  $ int_width          : num  1 4 5 5 5 5 5 5 5 5 ...
#>  $ fract_surv         : num  0.1 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
#>  $ death_rate         : num  4.06e-03 1.72e-04 8.20e-05 8.91e-05 3.72e-04 ...
#>  $ prob_dying         : num  0.004048 0.000687 0.00041 0.000445 0.00186 ...
#>  $ prob_surv          : num  0.996 0.999 1 1 0.998 ...
#>  $ num_alive_int      : num  100000 99595 99527 99486 99442 ...
#>  $ num_dying_int      : num  404.8 68.4 40.8 44.3 184.9 ...
#>  $ pers_yrs_lived_int : num  99636 398244 497532 497319 496746 ...
#>  $ pers_yrs_lived_past: num  7832451 7732815 7334572 6837040 6339721 ...
#>  $ obs_le_int         : num  78.3 77.6 73.7 68.7 63.8 ...
#>  $ sample_var         : num  7.92e-08 1.27e-08 7.30e-09 8.62e-09 3.29e-08 ...
#>  $ weighted_var       : num  4888352 724166 367032 374649 1224149 ...
#>  $ sample_var_pers_yrs: num  39355415 34467062 33742896 33375864 33001214 ...
#>  $ sample_var_obs_le  : num  0.00394 0.00347 0.00341 0.00337 0.00334 ...
#>  $ ci_low_95          : num  78.2 77.5 73.6 68.6 63.6 58.8 53.9 49.1 44.4 39.7 ...
#>  $ ci_high_95         : num  78.4 77.8 73.8 68.8 63.9 59 54.1 49.3 44.6 39.9 ...

get_le()

Basic usage

Typically you just want the estimated life expectancy at age 0 and you don’t need the entire the life table. You can use the get_le() function on the output of the make_life_table() function for this. Note the get_le() function was written explicitly to work with the output of the make_life_table() function.

get_le(life_table1)
#>   obs_le_int ci_low_95 ci_high_95
#> 1   78.32451      78.2       78.4

Additional options

Multiple groups

get_le() also works for grouped data. Like with the make_life_table() function, you should use the grouping_vars argument to specify the groups:

get_le(life_table2, grouping_vars = "group")
#>        group obs_le_int ci_low_95 ci_high_95
#> 1        phe   78.32451      78.2       78.4
#> 21 simulated   77.99384      77.9       78.1

Excluding confidence intervals

95% confidence intervals are included in the output of get_le() by default. You can elect to just get the estimate by setting include_ci to FALSE.

get_le(life_table1, include_ci = F)
#> [1] 78.32451

Ages other than 0

If you’re interested in the life expectancy at an age group other than 0, you can specify the age with the start_age argument. The start age you specify should be numeric and must be the start of an age group in the data passed through make_life_table().

get_le(life_table1, start_age = 30)
#>   obs_le_int ci_low_95 ci_high_95
#> 8    49.2322      49.1       49.3