2. Methods for producing uncertainty at single year of age
To produce uncertainty intervals for the mid-year population estimates broken down by age and sex, we require simulated values for each of the three components associated with statistical uncertainty also broken down by age and sex. The production of simulated values for internal migration at local authority level uses a model of internal moves by single year of age and sex. Therefore, we already have this level of granularity for internal migration. See Step A and Step B for a summary of the approach that we take for the census base and international migration components. Estimating uncertainty at single year of age in 2012 is a two-stage process. For the first stage (Step A), we produce an initial set of 1,000 simulated values for 2012 uncertainty.
For the census base population, we use parametric bootstrapping to create 1,000 plausible values for each single year of age for each sex. The assumption is that errors are normally distributed and that the variance for each age is taken from the published variances for the respective five-year age band (see note 1). The simulations for international migration are more complicated, but for both immigration and emigration we mirror the methods used to create these components in the production of mid-year population estimates.
Sex and age distributions are imputed for the 1,000 simulated immigrant flows generated for mid-year population estimate uncertainty at local authority level. The method clusters local authorities based on immigrant age and sex distributions, taken from census data. Local authorities are given the mean age and sex distributions for their cluster to produce 1,000 values for single year of age for each sex within local authorities.
Sex and age distributions are imputed for the 1,000 simulated emigrant flows generated for mid-year population estimate uncertainty at local authority level. Mirroring mid-year population estimate processes, local authorities are clustered based on age, sex and citizenship (British or non-British). Three years of International Passenger Survey (IPS) data (current year plus two previous years) for emigrants are used to create age and sex distributions, separately for British and non-British. Local authorities are given the mean age, sex and citizenship distributions for their cluster.
Simulations for each component are combined in accordance with the cohort component methodology:
We age-on the base population by a year.
The second stage (Step B) begins with calculating the coefficient of variation for the simulations derived in Step A. This is used in parametric bootstrapping to incorporate uncertainty into the population update between Census Day (March 27) and the mid-year (June 30). The bootstrapping assumes that errors are normally distributed around the mean, which for each local authority is taken as the difference between the published census value and the mid-year population estimate. We then repeat Step A, using the updated simulated values as the mid-year 2011 population base.
Process for calculating the mid-year 2012 simulations by single year of age, sex and local authority
Step A: Calculate preliminary mid-year 2012 simulations from the March 2011 census simulations
Step A1: Census populations March 2011 simulations by single year of age, sex and local authority
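These use the published variances for the corresponding five-year age group to derive variances by single year of age, on the assumption that the coefficient of variation for a single year of age is the same as for its five-year age group (see note 1).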
We then use parametric bootstrapping from the normal distribution N(census_SYOA, SD_SYOA), where SYOA denotes single year of age, to create 1,000 simulations for the census component for each local authority by single year of age and sex.
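The census bootstrapping in Step A1 can be sketched as follows. This is a minimal illustration rather than the production method: it assumes the five-year band's coefficient of variation is applied unchanged to each single year of age (as note 1 describes), and the function name, inputs and example figures are all hypothetical.

```python
import random

def simulate_census_syoa(census_syoa, census_band, sd_band, n_sims=1000, seed=0):
    """Draw n_sims normal simulations for each single year of age in one band."""
    rng = random.Random(seed)
    # Coefficient of variation of the published five-year band estimate.
    cv_band = sd_band / census_band
    # Assumption: each single year of age shares the band's coefficient of variation.
    sds = [cv_band * count for count in census_syoa]
    # One inner list per simulation, one value per single year of age.
    return [
        [rng.gauss(count, sd) for count, sd in zip(census_syoa, sds)]
        for _ in range(n_sims)
    ]

# Hypothetical census counts for the five single years in one age band.
census = [1200, 1150, 1300, 1250, 1100]
sims = simulate_census_syoa(census, census_band=6000, sd_band=120.0)
```

Each row of `sims` is one plausible census base for the band; the same call would be repeated for every band, sex and local authority.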
Step A2: Mid-2012 natural change (births minus deaths plus minor adjustments) simulations
These are already by single year of age, sex and local authority.
Step A3: Mid-2012 internal inflow simulations
These are already by single year of age, sex and local authority.
Step A4: Mid-2012 internal outflow simulations
These are already by single year of age, sex and local authority.
Step A5: Mid-2012 international immigration simulations
These mirror the methodology used by the Population Estimates Unit to calculate international immigration estimates by age and sex. 2011 Census data on immigrants are used to cluster local authorities with similar age and sex profiles. Sex and age within the international in-migration component for each local authority are imputed based on the mean distributions within the cluster that the local authority has been assigned to.
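The imputation of age and sex from a cluster's mean distribution might look like the sketch below. It takes cluster membership as given (the clustering itself is not shown), and it assumes each simulated total is split in exact proportion to the cluster mean distribution; the function names and example shares are hypothetical.

```python
def cluster_mean_distribution(member_distributions):
    """Average the age-sex distributions of the local authorities in one cluster."""
    n_members = len(member_distributions)
    cells = member_distributions[0].keys()
    return {
        cell: sum(dist[cell] for dist in member_distributions) / n_members
        for cell in cells
    }

def impute_age_sex(simulated_totals, distribution):
    """Split each simulated total across age-sex cells in proportion to the distribution."""
    return [
        {cell: total * share for cell, share in distribution.items()}
        for total in simulated_totals
    ]

# Hypothetical cluster of two local authorities, with three (sex, age) cells each.
la_a = {("M", 20): 0.5, ("F", 20): 0.3, ("M", 21): 0.2}
la_b = {("M", 20): 0.3, ("F", 20): 0.5, ("M", 21): 0.2}
dist = cluster_mean_distribution([la_a, la_b])

# Two simulated immigration totals for one local authority in the cluster.
imputed = impute_age_sex([1000.0, 1100.0], dist)
```

Because the shares sum to one, each simulated total is preserved when it is broken down by age and sex.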
Step A6: Mid-2012 international emigration simulations
These mirror the methodology used by the Population Estimates Unit to calculate international emigration estimates by age and sex. The 2011 Census is used to cluster local authorities based on sex, age and citizenship (British or non-British). Within each cluster, we use International Passenger Survey (IPS) data to create age, sex and citizenship distributions. British and non-British emigrants are assumed to have different age structures.
Three years of IPS data (current year and previous two years) provide a smoothed (centred average) single year of age distribution by citizenship and sex for each cluster. Sex and age are then imputed for each local authority’s emigration simulations, based on the distribution in the cluster that the local authority was assigned to.
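As an illustration of the three-year smoothing, the sketch below averages cell counts over three years and normalises them to shares. It assumes a simple unweighted mean and ignores survey weighting; the function name and the counts are hypothetical, and in practice this would be done separately for British and non-British emigrants in each cluster.

```python
def smoothed_distribution(counts_by_year):
    """Average cell counts over the supplied years, then normalise to shares."""
    n_years = len(counts_by_year)
    cells = counts_by_year[0].keys()
    # Unweighted mean count per (sex, age) cell across the years.
    mean_counts = {
        cell: sum(year[cell] for year in counts_by_year) / n_years
        for cell in cells
    }
    total = sum(mean_counts.values())
    return {cell: value / total for cell, value in mean_counts.items()}

# Hypothetical emigrant counts for one cluster and one citizenship group.
ips_years = [
    {("M", 20): 30, ("F", 20): 10},
    {("M", 20): 20, ("F", 20): 20},
    {("M", 20): 40, ("F", 20): 30},
]
shares = smoothed_distribution(ips_years)
```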
Step A7: Combine simulations from Steps A1 to A6 to derive preliminary mid-2012 simulations by single year of age, sex and local authority
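A cohort component combination for a single simulation could be sketched as below. The assumptions here are that the last age is an open-ended top group, that natural change enters as one net figure per age, and that draw k of each component is paired with draw k of the others; none of these details is confirmed by the description above, and the function names are hypothetical.

```python
def age_on(base):
    """Move everyone up one year of age; the last element is an open-ended top age group."""
    aged = [0.0] + list(base[:-1])
    # People already in the open-ended top group remain in it.
    aged[-1] += base[-1]
    return aged

def combine(base, natural_change, internal_in, internal_out, immigration, emigration):
    """Apply the cohort component update to one simulated population by age."""
    aged = age_on(base)
    return [
        a + nat + i_in - i_out + imm - emi
        for a, nat, i_in, i_out, imm, emi in zip(
            aged, natural_change, internal_in, internal_out, immigration, emigration
        )
    ]

# Hypothetical three-age example: ages 0, 1 and an open-ended group of 2 and over.
result = combine(
    base=[100, 110, 120],
    natural_change=[105, -1, -5],
    internal_in=[2, 3, 4],
    internal_out=[1, 2, 3],
    immigration=[1, 1, 1],
    emigration=[0, 1, 2],
)
```

Running this for each of the 1,000 simulations, sex and local authority would give the preliminary mid-2012 simulations.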
Step B: Derive 2011 mid-year simulations, then calculate final mid-year 2012 simulations
Step B1: Derive 2011 mid-year population simulations by single year of age, sex and local authority
Update the March 2011 Census simulations to the 2011 mid-year population, by single year of age, sex and local authority, as follows:
i. calculate the coefficients of variation of the mid-2012 simulations, by single year of age, sex and local authority, generated in Step A
ii. using parametric bootstrapping, simulate values for the change over the three months from the census to mid-2011; these values are drawn from the normal distribution, with the spread set by the coefficients of variation calculated in the previous step and the mean taken as the difference between the published 2011 mid-year estimate and the 2011 Census estimate by single year of age, sex and local authority (this incorporates uncertainty around these updates)
iii. add the values generated in the previous point (ii) to the census simulations
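Points i to iii can be sketched for a single age, sex and local authority cell as follows. The sketch assumes the standard deviation of the update is the coefficient of variation multiplied by the absolute mean update, which is one reading of point ii rather than a confirmed detail; the function names and example figures are hypothetical.

```python
import random
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def update_census_sims(census_sims, mid2012_sims, census_estimate, mid2011_estimate, seed=0):
    """Add a simulated census-to-mid-2011 change to each census simulation."""
    rng = random.Random(seed)
    cv = coefficient_of_variation(mid2012_sims)       # point i
    mean_update = mid2011_estimate - census_estimate  # published difference
    sd_update = abs(cv * mean_update)                 # assumption: SD = CV x |mean|
    # Points ii and iii: draw one update per simulation and add it to the census value.
    return [value + rng.gauss(mean_update, sd_update) for value in census_sims]

# Hypothetical inputs for one cell: 1,000 census and preliminary mid-2012 simulations.
census_sims = [1000.0 + (i % 10) for i in range(1000)]
mid2012_sims = [1050.0 + 5 * ((i % 10) - 4.5) for i in range(1000)]
mid2011_sims = update_census_sims(
    census_sims, mid2012_sims, census_estimate=1000, mid2011_estimate=1012
)
```

The resulting `mid2011_sims` would then serve as the population base when Steps A2 to A6 are repeated.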
Step B2: Repeat Steps A2 to A6, as before
Step B3: Combine simulations from Steps B1 and B2, to produce final mid-2012 simulations by single year of age, sex and local authority
Notes for: Methods for producing uncertainty at single year of age
- This assumes that the coefficient of variation for a single year of age is the same as for its respective five-year age group. Empirical testing suggests this may understate variance for 15-year-olds.
3. Uncertainty intervals
The uncertainty intervals generated by this method reflect known patterns of uncertainty in population estimation: they are generally wider for men than for women, and they are wider at student and young working ages.
Figures 1 and 2 provide examples.
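One common way to turn a set of simulations into an interval is to take empirical percentiles. The sketch below assumes a 95% interval bounded by the 2.5th and 97.5th percentiles; the text does not state which interval is used, so both the level and the function name are assumptions.

```python
import statistics

def uncertainty_interval(simulations):
    """Return the 2.5th and 97.5th percentiles of a list of simulated values."""
    # With n=40 the cut points fall at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(simulations, n=40)
    return cuts[0], cuts[-1]

# Hypothetical simulations: the integers 1 to 1,000 stand in for 1,000 simulated values.
lower, upper = uncertainty_interval(list(range(1, 1001)))
```

A wider gap between `lower` and `upper` for a given age and sex corresponds to the wider intervals described above.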
Figure 1: Mid-year estimates with uncertainty for males in Greenwich by single year of age
Source: Office for National Statistics