Table of contents
- Summary of methods
- Overview of currently available estimates
- Population definition
- Methods
- Limitations of the estimates
- Further information and contacts
- Appendix A: Methodology used in census years
- Appendix B: Methodology used to produce revised estimates, mid-2012 to mid-2020
- Appendix C: Administrative datasets used in production of small area estimates
- Appendix D: Simplified ratio change methodology diagram for production of SOA estimates
- Appendix E: Sub-threshold wards
- Cite this methodology
1. Summary of methods
Super Output Area and Output Area population estimates
Super Output Area (SOA) estimates are produced using a ratio change methodology. This method uses change in the population recorded in administrative sources as an indicator of change in the true population, and it is used to produce SOA estimates in intercensal periods. For consistency, Lower layer Super Output Area (LSOA) mid-year population estimates are constrained to Middle layer Super Output Area (MSOA) estimates, which in turn are constrained to local authority estimates.
LSOA population estimates are the starting point for calculating Output Area (OA) estimates. Administrative data sources are used to distribute the population, by single year of age and sex, between each OA within a single LSOA. Special populations (for example, prisoners and armed forces) are treated separately as they are static populations that are not fully included in the administrative data sources used to calculate OA estimates.
Further detail on the production of these estimates is given in Production of Super Output Area population estimates and Production of Output Area population estimates in Section 4.
Health geography population estimates
Integrated Care Board (ICB) population estimates are direct aggregations of LSOA estimates and therefore no detailed method is required for their production.
Further detail is given in Production of health geography estimates in Section 4.
Ward and Parliamentary constituency estimates
Ward and Parliamentary constituency population estimates are based on aggregations of whole OA estimates. OA boundaries are not an exact fit (non-coterminous) for current ward or Parliamentary constituency boundaries and are therefore allocated using a best-fit approach.
Further detail is given in Production of ward and Parliamentary constituency estimates in Section 4.
National Park estimates
National Park population estimates are provided for the exact boundaries of the National Park and therefore cannot be produced by aggregating whole OA estimates. The estimates are produced using a ratio change methodology that uses changes in the population of the wider area around the National Park (based on aggregations of OAs) as an indicator of the change in the true population of the National Park.
Further detail is given in Production of National Park estimates in Section 4.
2. Overview of currently available estimates
The Office for National Statistics (ONS) produces estimates of the resident population of England and Wales. The most authoritative population estimates are produced every 10 years and are based on the results of the latest census. These are updated annually to produce mid-year population estimates in the intercensal period (referred to as "rolled forward" estimates). The population estimates give a stock count of people living in England, Wales, the regions of England and local authority areas, and the composition of the population in these areas by age and sex. Further population statistics, including migration estimates, vital events (covering births, deaths, marriages and divorces) and population projections are also available. Detailed results from the decennial censuses are available on NOMIS and provide information on the characteristics of the usually resident population, for example, ethnicity and country of birth or marital status, for small areas.
Additionally, the ONS produces population estimates for small areas within England and Wales:
- Accredited Official Statistics - Middle layer Super Output Areas (MSOAs) at quinary ages, Lower layer Super Output Areas (LSOAs) at broad ages and health geographies
- Official statistics in development - wards, Parliamentary constituencies and National Parks
- Supporting Information - Output Areas (OAs), MSOAs by single year of age, LSOAs by single year of age and quinary ages
Small area population estimates for mid-2011 to mid-2022 Super Output Areas (SOAs) have been produced on Census 2021 boundaries. Mid-2002 to mid-2011 estimates for wards, Parliamentary constituencies, health geographies and National Parks have been published on boundaries available when they were published following census rebasing.
OA estimates have also been produced for mid-2001 to mid-2011 (on 2011 Census Output Areas) and mid-2011 to mid-2022 (on 2021 Census Output Areas), as these are used to form estimates for wards and Parliamentary constituencies.
Small area population estimates are produced using the best methods and data sources currently available. The 2011 Census provided an opportunity to benchmark these estimates against census data and to analyse the level of accuracy that has been achieved. A report entitled Small Area Population Estimates (SAPE) Evaluation: Report on Accuracy Compared to Results of the 2011 Census compared "rolled forward" SOA estimates for mid-2011 (based on 2001 Census data) with 2011 Census-based SOA estimates for mid-2011 and was published on 6 November 2015. This analysis shows how well the ratio change methodology has performed in estimating small area populations over the intercensal period.
3. Population definition
The population base from Census 2021 underpins the mid-year population estimates base and is defined as follows:
The mid-year population estimates are consistent with the standard UN definition for population estimates. The UN definition is based on the concept of usual residence and includes people who reside, or intend to reside, in the country for at least 12 months, whatever their nationality. Visitors and short-term migrants (who enter or leave the UK for less than 12 months) are not included.
Students are taken to be resident at their term-time address.
Members of His Majesty's armed forces stationed in England and Wales are included at their place of residence, but those stationed outside England and Wales are excluded. Members of the US armed forces and their dependants stationed in England and Wales are included.
In order to ensure that members of the armed forces were enumerated consistently, Census 2021 was designed so that members of the armed forces were enumerated at their "permanent or family home" (this is considered to be their usual residence for census purposes). The mid-year estimates' definition of usual residence for armed forces is different as it may be either their "permanent or family home" or the armed forces base, depending on individual circumstances. For the purposes of calculating mid-year population estimates, an adjustment has been applied to Census 2021 data at Output Area (OA) level to reallocate members of the home armed forces from their "permanent or family home" to their place of residence at the armed forces base, where these are different.
Before 2021, prisoners had been regarded as usually resident at an institution if they had been sentenced to serve six months or more. For 2021 onwards, the definition has been updated to be consistent with that used in Census 2021. That is, prisoners:
- serving 12 months or more
- on convicted unsentenced remand
- with indeterminate sentences
- who are non-criminals, on recall and those with sentence length not recorded
In practice, when producing a population estimate, a number of data sources have to be used, each with its own definition of usual residence. Further detail on the administrative datasets used to produce small area population estimates can be found in Appendix C.
4. Methods
This section details the ratio change (Super Output Area (SOA)) and apportionment approach (Output Area (OA)) methodologies used to produce small area population estimates in intercensal years. In a census year, population estimates are produced from census estimates and are aged forward and adjusted to account for differences between census day and the mid-year (30 June) and usual residence definitions. Following a census, rolled-forward estimates for the intercensal years are revised to account for population change over this period and to provide a consistent time series of population estimates. More details on the one-off methods used to produce estimates in census years and make the revisions to the estimates for the period mid-2012 to mid-2020 are included in Appendix A and Appendix B respectively.
Production of Super Output Area population estimates
The estimates for each year are produced using a ratio change methodology. Prior to the mid-2013 estimates, a number of different administrative datasets were used in the production of these estimates and these are documented in Appendix C. From mid-2013 to mid-2020 only the patient register data were used, while from mid-2021 onwards only the Personal Demographics Service (PDS) has been used.
The description of the methodology in the following sections uses the example of creating rolled-forward mid-2022 Lower layer Super Output Area (LSOA) estimates. For example, we produced the mid-2022 LSOA estimates using the mid-2021 LSOA estimates as the population base.
Middle layer Super Output Area (MSOA) estimates are created in a similar manner using derived MSOA quinary age-by-sex change ratios and are constrained to local authority mid-year estimates.
Step-by-step guide to methodology
The estimates were produced by applying the ratio change method to an LSOA estimate of the population base (the mid-2021 LSOA estimates) using PDS data. This issue is discussed in more detail in the Quality and Methodology Information (QMI) report.
Before applying these change ratios, some population counts are subtracted (referred to as the special population). These comprise UK armed forces, foreign armed forces and dependants, and prisoners. They are added again after these counts are constrained to the 2022 local authority mid-year estimates minus the special population.
The main assumption behind this ratio change method is that for each area, the data should have a consistent relationship with the true population over time.
Change ratios were calculated from the PDS data by quinary age group and sex. Each ratio is the mid-2022 PDS count for a given quinary age group and sex divided by the corresponding mid-2021 count. For example, a mid-2022 count of 50 divided by a mid-2021 count of 40 gives a change ratio of 1.25.
In summary:
1. The ratios are then applied to the mid-2021 LSOA population minus the mid-2021 special population by quinary age and sex.
2. The estimates are then constrained to the mid-2022 MSOA estimates (less mid-2022 special population), which have been constrained to the local authority mid-2022 estimates less mid-2022 special population.
3. The mid-2022 LSOA estimates by single year of age and sex are produced by apportioning the quinary age counts to single year of age using mid-2022 local authority constrained PDS single year of age and sex counts.
4. The mid-2022 LSOA estimates by single year of age and sex are then constrained for consistency to mid-2022 MSOA estimates by single year of age and sex (these counts are derived from mid-2022 MSOA quinary age and sex estimates created using the same ratio change methodology as for LSOAs, apportioned to single year of age and sex using mid-2022 local authority constrained PDS counts by single year of age and sex).
5. Updated mid-2022 special population counts are then added back in to the quinary and single year of age and sex counts, to give mid-2022 LSOA estimates by quinary and single year of age and sex.
Any change ratios produced for counts containing a zero by quinary age and sex were changed to one to ensure a valid ratio was produced. Where change ratios were applied to a base population by quinary age and sex of zero, the base population was changed to one to ensure an actual population count in the estimate, otherwise counts of zero in the base population would forever remain at zero.
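The ratio change steps and the zero-handling rules above can be sketched in a few lines of Python. This is a minimal illustration for a single quinary-age-by-sex cell, not the production code: the function names, the example counts and the parent total of 210 are all hypothetical, and it assumes the "base population of zero" rule is applied after the special population has been subtracted.

```python
from __future__ import annotations


def change_ratio(count_now: float, count_prev: float) -> float:
    """Change ratio for one quinary-age-by-sex cell of the administrative data."""
    # Rule from the text: a ratio involving a zero count is set to one.
    if count_now == 0 or count_prev == 0:
        return 1.0
    return count_now / count_prev


def roll_forward(base: float, special_prev: float,
                 count_now: float, count_prev: float) -> float:
    """Apply the change ratio to the base population minus the special population."""
    non_special = base - special_prev
    # Rule from the text: a zero base is changed to one so the cell is not
    # stuck at zero (assumed here to apply after removing the special population).
    if non_special == 0:
        non_special = 1.0
    return non_special * change_ratio(count_now, count_prev)


def constrain(estimates: dict[str, float], parent_total: float) -> dict[str, float]:
    """Scale estimates proportionally so that they sum to the parent-area total."""
    total = sum(estimates.values())
    return {area: value * parent_total / total for area, value in estimates.items()}


# Hypothetical cell (for example, males aged 20 to 24) in two LSOAs of one MSOA.
lsoa_a = roll_forward(base=120, special_prev=20, count_now=50, count_prev=40)
lsoa_b = roll_forward(base=80, special_prev=0, count_now=10, count_prev=0)

# Constrain to a hypothetical MSOA estimate less special population.
constrained = constrain({"A": lsoa_a, "B": lsoa_b}, parent_total=210)

# Add back a hypothetical updated (mid-2022) special population for LSOA A.
final_a = constrained["A"] + 18
```

In this sketch LSOA A uses the 50 divided by 40 ratio from the worked example, while LSOA B falls back to a ratio of one because its mid-2021 count is zero.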
Where SOA data for any of the datasets were identified to be erroneous, the calculated change ratios were updated to correct for identified errors. Criteria were developed to assist in the consideration of making any changes to the originally calculated change ratios.
An illustrative diagram of the ratio change method is shown in Appendix D.
Production of Output Area population estimates
OA estimates by age and sex are the building blocks used to form estimates for wards and Parliamentary constituencies using a best-fit method. OA estimates are consistent with estimates for higher geographies, such as SOAs, local authorities, and the national total for England and Wales. The description of the methodology that follows uses the example of creating rolled-forward mid-2022 OA estimates.
Step-by-step guide to methodology
Mid-2022 LSOAs are the starting point for calculating mid-2022 OA estimates. OA estimates are produced using an apportionment approach. The following stages are followed to produce mid-2022 OA estimates:
1. Create mid-2022 LSOA estimates, by single year of age and sex.
2. Remove special populations.
Special populations (prisoners and armed forces) are removed from the mid-2022 LSOA estimates. These populations are treated separately as they are static populations, at known locations, that are not included in the administrative data sources used to calculate OA estimates.
3. Apply PDS distribution.
The distribution of population between each OA in a single LSOA can be determined from administrative data sources. The number of patients on the PDS in each OA at the mid-year point is used as a proxy for the true size of the population at the same point in time.
For example:
LSOA x, made up from five OAs x1,..., x5, has 20 males aged 0 years and 15 females aged 0 years at mid-2022. The PDS distribution of 0-year-olds across the five OAs x1,..., x5 is shown in Table 1.
OA | Males aged 0: count | Males aged 0: % | Females aged 0: count | Females aged 0: % |
---|---|---|---|---|
x1 | 6 | 20 | 2 | 10 |
x2 | 3 | 10 | 4 | 20 |
x3 | 6 | 20 | 5 | 25 |
x4 | 12 | 40 | 8 | 40 |
x5 | 3 | 10 | 1 | 5 |
Total | 30 | 100 | 20 | 100 |
Table 1: Personal Demographics Service distribution
The mid-2022 estimates for 0-year-olds in the five OAs are therefore given by the percentages shown in Table 1 multiplied by the mid-2022 estimate for the parent LSOA. The example results are shown in Table 2.
OA | Males aged 0: LSOA total | Males aged 0: % | Males aged 0: OA total | Females aged 0: LSOA total | Females aged 0: % | Females aged 0: OA total |
---|---|---|---|---|---|---|
x1 | 20 | 20 | 4 | 15 | 10 | 1.5 |
x2 | 20 | 10 | 2 | 15 | 20 | 3 |
x3 | 20 | 20 | 4 | 15 | 25 | 3.75 |
x4 | 20 | 40 | 8 | 15 | 40 | 6 |
x5 | 20 | 10 | 2 | 15 | 5 | 0.75 |
Table 2: Example results
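The apportionment in Tables 1 and 2 can be reproduced with a short Python sketch. The function name `apportion` is our own label for illustration, and the sketch assumes every LSOA has at least one PDS record in the age-sex group (it does not handle a PDS total of zero).

```python
from __future__ import annotations


def apportion(lsoa_total: float, pds_counts: dict[str, int]) -> dict[str, float]:
    """Share an LSOA estimate between its OAs in proportion to PDS counts."""
    pds_total = sum(pds_counts.values())
    return {oa: lsoa_total * count / pds_total for oa, count in pds_counts.items()}


# PDS counts of 0-year-olds from Table 1.
males_pds = {"x1": 6, "x2": 3, "x3": 6, "x4": 12, "x5": 3}
females_pds = {"x1": 2, "x2": 4, "x3": 5, "x4": 8, "x5": 1}

# Mid-2022 LSOA estimates from the example: 20 males and 15 females aged 0.
males_oa = apportion(20, males_pds)
females_oa = apportion(15, females_pds)
```

The resulting values match the "OA total" columns of Table 2 and are not yet whole numbers, which is why a rounding and constraining stage follows.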
4. Rounding and constraining
The resulting estimates for each Output Area (OA) by single year of age and sex are then rounded to ensure estimates of whole persons and constrained to the mid-2022 Lower layer Super Output Area (LSOA) estimates by single year of age and sex. This process ensures that these estimates are fully consistent with mid-year population estimates for all higher geographies.
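One way to round to whole persons while keeping the LSOA total fixed is a largest-remainder approach, sketched below in Python. This is an assumption made for illustration; the text does not specify which rounding algorithm is used for OA estimates. The function name and the tie-breaking rule (by OA code) are also hypothetical, and the sketch assumes the unrounded values already sum to the target total.

```python
from __future__ import annotations

import math


def round_to_total(values: dict[str, float], total: int) -> dict[str, int]:
    """Round values to whole numbers that sum exactly to `total`."""
    rounded = {key: math.floor(value) for key, value in values.items()}
    shortfall = total - sum(rounded.values())
    # Hand the remaining units to the cells with the largest fractional parts;
    # ties are broken by key order, which is an arbitrary choice.
    by_remainder = sorted(values, key=lambda key: (-(values[key] - rounded[key]), key))
    for key in by_remainder[:shortfall]:
        rounded[key] += 1
    return rounded


# Unrounded female 0-year-old OA estimates from Table 2 (LSOA total of 15).
females_oa = {"x1": 1.5, "x2": 3.0, "x3": 3.75, "x4": 6.0, "x5": 0.75}
females_rounded = round_to_total(females_oa, total=15)
```

Here the two spare units go to x3 and x5, which have the largest fractional parts, so the rounded OA values still sum to the LSOA estimate of 15.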
Production of health geography estimates
The Integrated Care Boards (Establishment) Order 2022 introduced a new structure for NHS organisations that replaced Clinical Commissioning Groups (CCGs) with Integrated Care Boards (ICBs) from 1 July 2022. ICB areas are formed from groups of LSOAs and therefore ICB population estimates are created by directly aggregating LSOA estimates.
ICBs are organised into the higher level of health geography of NHS England (7 regions). The regions are formed from groups of ICBs, therefore population estimates for these areas are also created by directly aggregating LSOA estimates.
Production of ward and Parliamentary constituency estimates
Mid-year OA estimates are directly aggregated to produce ward and Parliamentary constituency estimates by single year of age and sex. This is achieved by using the published OA-to-ward and OA-to-Parliamentary-constituency geography lookups, which are available from the Open Geography Portal. These lookups allocate OAs to higher-level geographies using a best-fit method. For each OA, a single fixed point is established that represents how the population is spatially distributed within the OA. These points are called population-weighted centroids and are calculated algorithmically based on Census 2021 estimates. The allocation of OAs to wards and Parliamentary constituencies is based on where this point falls. Prior to 2011, ward and Parliamentary constituency estimates were produced using a postcode best-fit method. Further details on this method are available under methods and guidance on our archive website.
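The best-fit idea can be illustrated with a small Python sketch: each OA is allocated to the ward whose boundary contains its population-weighted centroid, and OA estimates are then summed by ward. Everything here is hypothetical (the square ward boundaries, centroid coordinates, OA codes and estimates), and the point-in-polygon test is a standard ray-casting check rather than the method used to build the published lookups.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: is the point inside the polygon?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def best_fit(centroids: Dict[str, Point], wards: Dict[str, List[Point]]) -> Dict[str, str]:
    """Allocate each OA to the ward that contains its centroid."""
    lookup = {}
    for oa, centroid in centroids.items():
        for ward, boundary in wards.items():
            if point_in_polygon(centroid, boundary):
                lookup[oa] = ward
                break
    return lookup


def aggregate(oa_estimates: Dict[str, int], lookup: Dict[str, str]) -> Dict[str, int]:
    """Sum OA estimates to the higher geography given by the lookup."""
    totals: Dict[str, int] = {}
    for oa, estimate in oa_estimates.items():
        ward = lookup[oa]
        totals[ward] = totals.get(ward, 0) + estimate
    return totals


# Two hypothetical square wards side by side and three hypothetical OA centroids.
wards = {
    "W1": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "W2": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}
centroids = {"OA1": (0.5, 1.0), "OA2": (1.8, 0.5), "OA3": (3.0, 1.5)}

lookup = best_fit(centroids, wards)
ward_totals = aggregate({"OA1": 300, "OA2": 280, "OA3": 310}, lookup)
```

Because each OA is allocated whole, an OA that straddles a ward boundary contributes all of its population to the ward containing its centroid, which is the sense in which the resulting estimates are "best-fit" rather than exact.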
In England and Wales, there were 7,666 electoral wards as at 31 December 2022. Census 2021 treated the 25 electoral wards of the City of London and the 5 electoral wards of the Isles of Scilly local authorities as single wards, not made up of multiple wards. Best fitting of output areas to the 31 December 2023 set of electoral wards for City of London and Isles of Scilly provides estimates for some wards within these areas. However, 16 wards are smaller than OAs and they will not have separate OA estimates attached to them. In these cases, neighbouring wards have assumed the populations as detailed in Appendix E.
Production of National Park estimates
The Office for National Statistics (ONS) Geography policy states that statistics for higher-level geographies should be built from OA statistics using a "best-fit" allocation. National Parks are an exception to this, as they cannot be suitably estimated through best-fitting of OAs. As such, estimates for National Parks for Census 2021 were produced on an exact-fit basis; however, this process cannot be repeated for mid-year population estimates.
Estimates for mid-2021 onwards
A ratio change method was used to roll forward the published Census 2021 National Park estimates (by 5 year age and sex groups) to produce mid-2021 estimates. This approach uses the population growth of the wider National Park area as a proxy for the change within the National Park boundary. These wider areas are the groups of OAs that have postcodes lying within National Park boundaries (for example, Figure 1), as determined by the National Statistics Postcode Lookup (NSPL).
Figure 1: Output Areas within Exmoor National Park
Source: Office for National Statistics – Methodology note
The same approach was then used to roll forward mid-2021 estimates to produce mid-2022 estimates.
Step-by-step guide to methodology
The following stages are followed to carry out the ratio change methodology to produce the National Park population estimates for each year. The example given is for producing mid-2021 estimates rolled forward from Census 2021, but the process is analogous for all years.
1. Create wider National Park areas from OAs.
The NSPL lists all current and previous postcodes and the higher geographies in which they are deemed to lie. This is used to create a lookup of OAs that fall wholly or partly within National Parks to form wider-National Park areas.
2. Calculate proportional change for rolling forward.
Population estimates for these wider areas are created for Census 2021 and mid-2021 by aggregating the OA estimates for those periods. The proportional population change for each group is calculated for each area. That is:
change (males aged zero years) = mid-2021 (males aged zero years) / Census 2021 (males aged zero years)
3. Rounding and constraining.
The resulting estimates by National Park, sex and age are not integers. Using sumsafe rounding in Python, we round the non-integer values to integers while preserving the totals of the initial estimates by National Park, sex and five-year age band.
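The roll-forward for a single National Park can be sketched as follows. This is an illustrative Python outline under stated assumptions: the group labels and all counts are hypothetical, the function name is our own, and the sketch assumes the wider-area census count is non-zero for every group. It covers only the proportional change and its application to the Census 2021 National Park estimate, not the rounding stage.

```python
from __future__ import annotations

Group = tuple[str, str]  # (sex, five-year age band)


def roll_forward_park(park_census: dict[Group, float],
                      wider_census: dict[Group, float],
                      wider_mid_year: dict[Group, float]) -> dict[Group, float]:
    """Apply the wider-area proportional change to the National Park estimate."""
    estimates = {}
    for group, park_count in park_census.items():
        # Proportional change of the wider area (aggregated OAs) for this group.
        change = wider_mid_year[group] / wider_census[group]
        estimates[group] = park_count * change
    return estimates


# Hypothetical counts for one National Park and its wider area.
park_census = {("M", "0-4"): 200, ("F", "0-4"): 190}
wider_census = {("M", "0-4"): 400, ("F", "0-4"): 380}
wider_mid_year = {("M", "0-4"): 420, ("F", "0-4"): 361}

park_mid_year = roll_forward_park(park_census, wider_census, wider_mid_year)
```

In this example the wider area grows by 5% for males and shrinks by 5% for females, so the National Park estimates move by the same proportions before rounding.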
Statistical disclosure control of estimates
The disclosure control processes applied to the estimates include small adjustments made to selected cells. Adjustments are made in such a way that inference of an underlying count is not possible but that the usefulness of the aggregated estimates is not materially affected.
5. Limitations of the estimates
The estimates have been produced using administrative data sources to identify annual population change in intercensal years. Any deficiencies in these data sources may therefore impact upon the quality of the estimates produced. Where known deficiencies have been identified, corrective measures have been applied. However, other deficiencies in the use of administrative data sources for producing population estimates may be less apparent, for example, list cleaning of the patient registers.
Small area population estimates were initially intended for publication by five-year age group and sex. More detailed estimates have since been provided by single year of age and sex. These are intended to enable and encourage further analysis and use of the estimates. Particular caution should be exercised in using estimates at a greater level of disaggregation, for example, for Output Areas (OAs) or for single year of age groups, as these would not be expected to have the same level of accuracy as the aggregated estimates.
Mid-2012 to mid-2020 estimates
Revised mid-2012 to mid-2020 estimates have been produced to provide a consistent time series of population estimates for the intercensal period. A method was put in place that reconciled the rolled forward LSOA estimates to the mid-2021 Census-based LSOA estimates. This method balanced the need to produce a plausible revised back-series for total populations and sensible age-sex distributions that can be used for producing best-fit population estimates and bespoke population groups.
One limitation of this method is that it relies on assuming how the difference between the two sets of estimates for mid-2021 has developed over time. This assumption will be particularly important for OAs or LSOAs where Census 2021 estimates were very different from the rolled-forward estimates.
The differences are distributed back across the LSOA back-series by age-sex and then estimates are constrained upwards to the revised local authority mid-year estimates. This process ensures that LSOA-level estimates are consistent with local authority estimates. Care should be taken in interpreting age distributions for areas affected by this issue.
Mid-2011 and Mid-2021 estimates
For mid-year estimates based on the 2011 and 2021 Census, no explicit adjustments were made for either internal or international in- or out-migration in the period from census day to 30 June (mid-year). However, an adjustment will have been made through the constraining to the local authority estimates that will have included these components. This therefore assumes that population change resulting from migration by age-sex (in- and out- migrants) in this period in all Super Output Areas (SOAs) within a local authority is proportional to its size in those age-sex groups. This is unlikely to be true. However, as this estimate is for a short time, it is unlikely that the differences will be large.
Revised mid-2002 to mid-2010 estimates
Revised mid-2002 to mid-2010 estimates have been produced to provide a consistent time series of population estimates for the intercensal period. A method was put in place that reconciled the rolled forward LSOA estimates to the mid-2011 Census-based LSOA estimates, balancing the need to produce a plausible revised back-series for total populations and sensible age-sex distributions that can be used as building blocks for producing best-fit population estimates and bespoke population groups. The revisions also include a number of corrections for known issues with the previous series of estimates. The most important of these were:
- an adjustment to correct for underestimation of foreign armed forces in Harrogate local authority in the mid-2009 to mid-2010 estimates
- an adjustment to correct for boundary changes in Neath Port Talbot and Powys between mid-2005 and mid-2009
Research undertaken prior to the publication of the revised mid-2002 to mid-2010 small area population estimates identified three possible methods to produce a back-series of population estimates. A "full assessment method" using census and administrative data along with an individual consideration of each area, where required, would have resulted in more accurate estimates overall. However, the advantages of increased accuracy were weighed against the impact on timeliness (that is, how soon the estimates could be published). Here there was a trade-off between different aspects of the quality of the estimates.
The "distribution of the difference" method provided the best balance in the majority of small areas between producing a plausible back-series of population estimates for each individual area and using a relatively straightforward method to allow timely publication. The method was designed to identify the difference between the census-based and rolled-forward mid-2011 estimates for each OA and LSOA and to distribute this difference across the back-series in order to remove any "jump" in the estimates between mid-2010 and mid-2011. Consequently, the patterns of change identified in administrative data using the ratio change method may not be maintained in the revised mid-2002 to mid-2010 figures.
One of the limitations of this method is that it relies on assuming how the difference between the two sets of estimates for mid-2011 has developed over time. This assumption will be particularly important for OAs or LSOAs where the 2011 Census estimates were very different from the rolled-forward estimates.
As the difference is distributed across the OA and LSOA back-series by age-sex cohort, an implicit assumption is also made that populations in mid-2011 would have been in an area in 2002 at a younger age (that is, a 19-year-old male in mid-2011 would have been in the same area in 2002 but aged 10). This was a particular issue in LSOAs with high student-aged populations. Constraining the LSOA estimates to the revised subnational mid-year estimates will have corrected for this to a certain degree; however, a minority of LSOAs show very small counts at younger ages as a result of this assumption. Care must be taken in interpreting age distributions for areas affected by this issue.
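To make the "distribution of the difference" idea concrete, the Python sketch below spreads the mid-2011 difference back across a series. It rests on a strong assumption that is ours, not the documented method: the difference is taken to have built up linearly since the previous census base year. It also works on a single total rather than by age-sex cohort, and it omits the subsequent constraining to local authority estimates. All figures and names are hypothetical.

```python
from __future__ import annotations


def distribute_difference(rolled_forward: dict[int, float],
                          census_based_final: float,
                          start_year: int,
                          end_year: int) -> dict[int, float]:
    """Phase the end-year difference back across the series (assumed linear)."""
    difference = census_based_final - rolled_forward[end_year]
    span = end_year - start_year
    return {
        year: value + difference * (year - start_year) / span
        for year, value in rolled_forward.items()
    }


# Hypothetical rolled-forward series for one LSOA, mid-2001 to mid-2011.
rolled_forward = {year: 1500.0 + 10 * (year - 2001) for year in range(2001, 2012)}

# Hypothetical census-based mid-2011 estimate, 50 higher than the rolled-forward one.
revised = distribute_difference(rolled_forward, census_based_final=1650.0,
                                start_year=2001, end_year=2011)
```

Under this assumption the mid-2001 figure is unchanged, the mid-2006 figure rises by half of the difference, and the mid-2011 figure equals the census-based estimate, so no "jump" remains between mid-2010 and mid-2011.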
6. Further information and contacts
We welcome user feedback on the population estimates. To provide feedback or to request further information, please contact the Demography team either by email at pop.info@ons.gov.uk or by telephone on +44 (0) 1329 444661.
7. Appendix A: Methodology used in census years
Mid-2021 Super Output Areas
Estimates of the resident population as at mid-2021 have been produced for publication by quinary age group and sex, for Lower layer Super Output Areas (LSOAs) and Middle layer Super Output Areas (MSOAs). Data by single year of age will also be available. For consistency, the mid-2021 LSOA population estimates, by age and sex, are constrained to the mid-2021 local authority estimates. MSOA estimates are produced by directly aggregating the LSOA estimates.
Step-by-step guide to methodology
The following stages were followed to produce mid-2021 LSOA estimates:
Unadjusted Census 2021 LSOA population estimates by single year of age and sex for the usually resident population were aged forward from 21 March 2021 (Census Day) to 30 June 2021.
These aged-forward estimates were adjusted to account for differences in armed forces usual residence definitions between census and mid-year estimates (see Production of Output Area population estimates in Section 4). The census base data by single year of age described here was rounded to the nearest 5.
Births occurring from 21 March 2021 to 30 June 2021 were added. Nationally, this added 171,000 babies (zero-year-olds).
Deaths occurring between 21 March 2021 and 30 June 2021 were subtracted. Nationally, this reduced the estimates by 135,000 persons.
The LSOA estimates by single year of age and sex were constrained to the mid-2021 local authority mid-year estimates. This constraining is required because the mid-2021 local authority estimates include adjustments for internal and international migration (see Production of ward and Parliamentary constituency estimates in Section 4).
These LSOA estimates by sex and single year of age were then aggregated to produce MSOA estimates and estimates for both LSOAs and MSOAs by quinary age group.
Mid-2021 Output Areas and other geographies
Step-by-step guide to methodology
The following stages were followed to produce mid-2021 Output Area (OA) estimates that are then aggregated to produce mid-2021 ward and Parliamentary constituency estimates on a best-fit basis.
Produce estimates for mid-2021 OAs, by single year of age and sex
For each LSOA:
1. Establish the component OAs that make up each LSOA (in this example, LSOA X consists of four OAs: X1, X2, X3 and X4).
Figure 2: LSOA X
Source: Office for National Statistics – Methodology note
2. For each age band and sex of the adjusted Census 2021 LSOA estimate, calculate the contribution from each component OA. For example, 25% of males aged zero to four years in the adjusted Census 2021 estimate for LSOA X are in OA X1, 15% are in OA X2 and so on.
This is the "2021 Census OA-LSOA population distribution".
3. Apply the "2021 Census OA-LSOA population distribution" to the mid-2021 LSOA estimates, by age band and sex, to derive mid-2021 OA estimates (also by age band and sex).
In the example for LSOA X, 25% of the mid-2021 estimate for males aged zero, one and two years is allocated to OA X1, 15% to OA X2 and so on. This process is then repeated for all age bands and sex groups.
4. Round the resulting OA estimates to whole persons while maintaining totals to mid-2021 LSOA estimates.
Aggregate the mid-2021 OA estimates to produce ward and Parliamentary constituency estimates
The final step is to aggregate the mid-2021 OA estimates to produce 2021 ward and Parliamentary constituency estimates by single year of age and sex. This was done by using the published OA-to-ward and OA-to-Parliamentary-constituency geography lookups. Allocation of OAs was determined using a best-fit method, which uses OAs as building blocks to estimate higher geographies.
Mid-2001 Super Output Areas
The findings of our research that reviewed evidence on the 2001 Census estimates indicated that while no single piece of evidence on its own was conclusive, the weight of evidence suggested that the 2001 Census did not cover all people in England and Wales, particularly young adult men. Accordingly, the 2001 local authority mid-year estimates were revised in September 2004 to reflect this evidence, including:
- adjustments for missing census forms
- longitudinal study adjustments
- the Manchester and Westminster Matching Studies
- 2004 local authority studies
These adjustments resulted in a national count for mid-2001 that was 318,000 greater than the 2001 Census count. The methodology used to produce mid-2001 Super Output Area (SOA) population estimates reflects the population adjustments incorporated in the local authority mid-year estimates.
Estimates of the resident population as at mid-2001 were produced for publication by broad age group and sex for LSOAs and quinary age group and sex for MSOAs. For consistency, the mid-2001 LSOA population estimates, by age and sex, are constrained to the mid-2001 MSOA estimates, which in turn are constrained to the 2001 local authority mid-year estimates.
The methodology used to create these mid-2001 SOA estimates was quality assured and approved by the ONS Methodology Directorate.
Step-by-step guide to methodology
The derived mid-2001 LSOA population base reflects the methodology used to create the revised mid-2001 national and local authority estimates, applied at the smaller LSOA geography. In summary, these stages were followed to produce the mid-2001 LSOA estimates:
1. Unadjusted census OA counts by single year of age and sex for the usually resident population were aggregated to LSOAs.
2. To these counts, an adjustment was made for missing census forms, adding 5,100 people in 21 LSOAs.
3. The population counts by single year of age and sex were aged forward from 29 April 2001 to 30 June 2001.
4. To these counts, the longitudinal study local authority population revisions to males aged 21 to 50 years in the revised mid-2001 local authority estimates were added. Nationally, 163,800 persons were added. These were disaggregated to 6,647 LSOAs, using an adaptation of the methodology used to derive local authority male counts. We have published the methodology used to make the local authority revisions (PDF, 34.5KB).
5. To these counts, 2,453 LSOA adjustments for 15 local authorities resulting from the local authority studies review were added. Nationally, this was 107,400 persons.
6. Births (zero-year-olds) occurring between 1 May 2001 and 30 June 2001 were added. Nationally, this added 100,900 babies.
7. Deaths occurring between 1 May 2001 and 30 June 2001 were subtracted. Nationally, this reduced the estimates by 84,100 persons.
8. The single year of age and sex counts were constrained to the revised 2001 local authority mid-year estimates (published in September 2004) and to reflect a population change of around 1,100 resulting from April 2003 boundary changes in three Welsh unitary authorities: Carmarthenshire, Ceredigion and Pembrokeshire.
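The constraining in the final stage can be sketched as a proportional scaling of LSOA counts to a local authority total for one age and sex group. This note does not describe the constraining algorithm in detail, so simple proportional scaling is an assumption here, and the codes and figures are hypothetical.

```python
def constrain_to_total(lsoa_counts, la_total):
    """Scale LSOA counts for one age-sex group so they sum to the local
    authority estimate. Proportional scaling is an assumed approach."""
    current = sum(lsoa_counts.values())
    if current == 0:
        raise ValueError("cannot scale a group with no population")
    return {lsoa: count * la_total / current
            for lsoa, count in lsoa_counts.items()}

# Hypothetical LSOA counts within one local authority
constrained = constrain_to_total({"LSOA_A": 100, "LSOA_B": 300}, la_total=420)
```

The scaled values are not whole numbers in general, so a rounding step that preserves the total would follow in practice.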
MSOA estimates were produced by aggregating the LSOA counts. To address comments received following the user consultation with the previously published ward-level population estimates, further adjustments were incorporated into these LSOA and MSOA estimates as follows:
1. Adjustments for identified under or over estimation at MSOA level were applied to the component LSOAs.
2. These adjustments reflect MSOAs where local authority census response was poor and where the MSOA estimates following longitudinal study adjustments differed significantly from counts from administrative data sources. The adjustments resulted in some LSOAs having a reduction in population and others an increase. Net change at local authority level was zero. Adjustments were made to 818 LSOAs within 173 MSOAs in 13 local authorities. A list of these 13 local authorities is shown in Appendix B.
3. Owing to inconsistencies between census home armed forces counts and mid-2001 Defence Analytical Services Agency (DASA) local authority counts, negative counts were produced by single year of age when the mid-2001 special population was deducted from the resident population to produce the mid-2002 estimates. To overcome this, adjustments were made for 51 LSOAs within 43 MSOAs in 33 local authorities that had negative counts when the special population was subtracted. Compensating adjustments were spread across all other LSOAs and MSOAs within these local authorities. Net change at local authority level was zero.
4. Owing to inconsistencies between census prisoner counts and mid-2001 Home Office prisoner counts, negative counts were produced by single year of age when the mid-2001 special population was deducted from the resident population for the production of mid-2002 estimates. To overcome this, adjustments were made for six LSOAs within six MSOAs in six local authorities that had negative counts when the special population was subtracted. Compensating adjustments were spread across all other LSOAs within these local authorities. Net change at local authority level was zero.
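The zero-net adjustments for negative counts described in the last two items can be illustrated as follows. The sketch raises the resident count in any LSOA where subtracting the special population would go negative, then removes the same total from the other LSOAs in the local authority. Spreading the compensation in proportion to each LSOA's remaining population is an assumption, as are all names and figures.

```python
def fix_negative_residuals(resident, special):
    """Adjust resident counts (one age, one local authority) so that
    resident minus special population is never negative, with zero net
    change. Proportional spreading is an assumed rule."""
    residual = {k: resident[k] - special.get(k, 0) for k in resident}
    deficit = -sum(v for v in residual.values() if v < 0)
    donor_total = sum(v for v in residual.values() if v > 0)
    if deficit > donor_total:
        raise ValueError("not enough population elsewhere to compensate")
    adjusted = {}
    for k, count in resident.items():
        if residual[k] < 0:
            # Raise the count so the special population is exactly covered.
            adjusted[k] = special[k]
        elif residual[k] > 0 and deficit > 0:
            # Compensating reduction, shared across the other LSOAs.
            adjusted[k] = count - deficit * residual[k] / donor_total
        else:
            adjusted[k] = count
    return adjusted

# Hypothetical counts: LSOA "A" has 5 residents but 8 recorded prisoners
resident_counts = {"A": 5, "B": 40, "C": 60}
adjusted_counts = fix_negative_residuals(resident_counts, {"A": 8})
```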
Additional adjustments
Further changes to the LSOA mid-2001 population were made to reflect changes in the allocation of postcodes to LSOAs. Postcodes that existed at the time of the 2001 Census, which were split by electoral ward boundaries, were assigned to a single census OA on the basis of where the majority of the population lived. Other postcodes in use prior to April 2001, and new postcodes created after then, were assigned to a census OA using the grid reference of the address closest to the mean of the easting and northing for each postcode. As a result, some postcodes were allocated to a different census OA than would have been the case if using a grid reference allocation. A new methodology was introduced with the August 2006 National Statistics Postcode Lookup (NSPL) to resolve this.
All postcodes in England and Wales are now assigned by a point-in-polygon process (that is, plotting grid references and assigning them to digital boundaries by geographic information system) using current grid references. As the grid reference for each postcode will be current and the geography allocations will be directly based on it, the two will always correspond.
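For readers unfamiliar with point-in-polygon assignment, the sketch below shows one common technique, ray casting, applied to a postcode grid reference. It is an illustration only: the geographic information system used in production is not described here, and the square boundaries, codes and function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray
    from (x, y) crosses; an odd number means the point is inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_postcode(easting, northing, boundaries):
    """Return the code of the first boundary containing the grid reference."""
    for code, polygon in boundaries.items():
        if point_in_polygon(easting, northing, polygon):
            return code
    return None

# Hypothetical square boundaries as lists of (easting, northing) vertices
boundaries = {
    "LSOA_A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "LSOA_B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
```

With these boundaries, a grid reference of (5, 5) is assigned to LSOA_A and (15, 5) to LSOA_B.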
The implication of this change for small area population estimates is that the population residing within addresses for some postcodes will have a changed LSOA allocation. Therefore postcodes included in a particular LSOA at the time of the 2001 Census will now be allocated to a different LSOA, affecting the mid-2001 base population estimates.
The postcodes and LSOAs that were affected the most in terms of overall population numbers were identified. Some LSOAs will have been overestimated at mid-2001 and others similarly underestimated. The postcodes investigated that had a changed LSOA allocation with the May 2006 NSPL were those that from the 2001 Census had a population count of 100 or more and that were previously assigned to a single census OA. They were therefore also the postcodes where the impact of possible adjustment would result in the greatest improvement to the estimates. A 2001 Census population count of 100 was therefore the threshold considered for base population adjustment, and 76 such postcodes were identified.
A visual check was then done using Ordnance Survey digital mapping to check that the new LSOA allocation was correct. In some cases, it was identified that the original postcode to LSOA allocations were in fact more accurate. LSOA population adjustments for mid-2001 were made for 49 postcodes with an associated population count of 100 or more; this affected 88 LSOAs in 62 MSOAs in 40 local authorities. For all of these postcodes, the local authority allocation was unchanged.
When constraining to the local authority mid-year estimates, special account was taken of the population change of around 1,100 persons (as at mid-2001) resulting from April 2003 boundary changes in three Welsh unitary authorities: Carmarthenshire, Ceredigion and Pembrokeshire. The SOA counts for these local authorities if aggregated to local authority level will not be consistent with the published local authority mid-year estimates, as these reflect the local authority boundaries as at mid-2001 whereas the SOA boundaries reflect the geography at the time of the April 2003 boundary changes.
Mid-2001 estimates for other geographies
Mid-2001 population estimates for all other geographies (including OAs, wards and Parliamentary constituencies) were produced using the postcode best-fit (PBF) method. Details of this method are available from the Population estimates methodology archive.
8. Appendix B: Methodology used to produce revised estimates, mid-2012 to mid-2020
Super Output Areas
The revised estimates of the resident population between mid-2012 and mid-2020 have been produced for publication by single year of age and sex, for Lower layer Super Output Areas (LSOAs) and Middle layer Super Output Areas (MSOAs). For consistency, the revised mid-2012 to mid-2020 LSOA population estimates, by single year of age and sex, are constrained to the revised mid-2012 to mid-2020 local authority estimates. MSOA estimates are produced by directly aggregating the LSOA estimates.
Step-by-step guide to methodology
The following stages were followed to produce the revised mid-2012 to mid-2020 LSOA estimates:
1. Mid-2012 to mid-2021 rolled-forward LSOA population estimates (based on the 2011 Census Output Areas (OAs)) by single year of age and sex for the usually resident population were mapped and apportioned to the new 2021 LSOA boundaries using 2011 to 2021 geography lookup files. See our information on lookups between 2011 and 2021 geographies.
2. The difference between the mid-2021 Census-based and the mid-2021 rolled-forward LSOA population estimates was calculated by single year of age and sex.
3. The difference at mid-2021 was distributed cumulatively across the mid-2012 to mid-2020 LSOA rolled-forward series by age-sex.
4. The revised mid-2012 to mid-2020 LSOA estimates were then constrained to the previously published revised census-based local authority estimates for mid-2012 to mid-2020.
5. These LSOA estimates by single year of age and sex were then aggregated to produce MSOA estimates.
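The cumulative distribution of the mid-2021 difference can be sketched for a single LSOA and age-sex group. The note says only that the difference is distributed cumulatively; the linear weighting used here (one tenth of the difference added per year after mid-2011) is an assumption, and the function name and figures are hypothetical.

```python
def revise_series(rolled_forward, census_based_2021,
                  base_year=2011, end_year=2021):
    """Spread the mid-2021 gap between census-based and rolled-forward
    estimates back over the intervening years. Linear cumulative
    weights are an assumed choice."""
    diff = census_based_2021 - rolled_forward[end_year]
    span = end_year - base_year
    return {
        year: value + diff * (year - base_year) / span
        for year, value in rolled_forward.items()
        if base_year < year < end_year
    }

# Hypothetical flat rolled-forward series of 100 persons, mid-2012 to mid-2021
rolled = {year: 100 for year in range(2012, 2022)}
revised = revise_series(rolled, census_based_2021=110)
```

Under this assumption, years closer to mid-2021 absorb more of the difference, and the revised series would then be constrained to the local authority estimates.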
National Parks
Revised mid-2012 to mid-2020 OA estimates were used to produce a consistent time series of National Park population estimates for the intercensal period. The OAs are used to form a time series of estimates for the wider National Park areas (see Production of National Park estimates in Section 4), allowing the annual population changes to be established. The changes in these wider areas are initially used to roll forward National Park estimates for 2011 to produce a National Park time series to mid-2021. The method for rolling forward uses the ratio change approach described in Production of National Park estimates in Section 4.
The estimates rolled forward to mid-2021 are compared against the census-based mid-2021 National Park estimates. Differences are rolled backwards across the decade to produce a set of National Park estimates that are consistent with both the mid-2011 and mid-2021 National Park estimates.
9. Appendix C: Administrative datasets used in production of small area estimates
Following an evaluation of a number of data sources, the following administrative datasets were used with the ratio change method to produce the Super Output Area (SOA) estimates.
Personal Demographics Service (mid-2022 onwards)
For production of mid-2022 small area estimates onwards, the Personal Demographics Service (PDS) replaced the patient register as the administrative data used for ratio change. The PDS includes everyone on the patient register but may include extra records that would be excluded from the usually resident population, such as patients who have died but not yet been removed and non-residents (for example, short-term visitors) who have accessed a PDS-supported system (such as A&E) but who may not go on to register with a GP. As part of our pre-processing, we remove records that do not meet our inclusion criteria, which results in a dataset similar to an equivalent patient register.
Patient register (up to mid-2020)
Year-specific July counts of individuals included on the NHS patient registers were used, that is, persons registered with a doctor, by single year of age and sex at postcode level. Through the use of a postcode lookup table (for example, the National Statistics Postcode Lookup (NSPL)) these counts can be aggregated to different geographical levels such as wards and SOAs.
The data provided to us have already been validated, so only records with a valid postcode are received. Improvements in the quality of postcodes on the patient registers reflect the efforts by the strategic health authorities (in England) and health boards (in Wales) to improve the quality of the data on their registers.
Some of our previous research, describing the use of patient registers to measure internal migration in England and Wales, concluded that at the local authority level, patient register counts do not provide a reliable estimate of the resident population of England and Wales. It was also noted that for the patient registers to be used in this way, the counts would require significant adjustments and further research would be needed.
When the patient register count exceeds the mid-year population estimates, this is often referred to as "list inflation". This may occur when some patients have more than one NHS number and are double counted, and patients may be on doctors' lists after having left the country. List inflation may also be localised, for example, in student areas where students do not quickly re-register after finishing their course of study and moving away from an area.
List inflation is greatest in London boroughs. It is expected that this list inflation is mainly caused by a high number of international migrants moving into London and registering with a GP. Many of them will not be removed from the GP lists when they leave.
Conversely, there are other areas where the patient registers will be missing individuals because these persons are ineligible for registration with a GP. This has predominantly been in areas where there is known to be a high armed forces population, as this group has not generally been registered with an NHS GP. Other groups of the population that are excluded from the NHS are prisoners who are sentenced for a term of two years or more and certain patients in long-stay medical hospitals. In addition, there are individuals who obtain all their medical care privately who may not be registered with a GP, but these numbers are likely to be small.
Child Benefit data (up to mid-2012)
Child Benefit data were not used in the production of SOAs from mid-2013, as noted in Step-by-step guide to methodology in Section 4.
The Department for Work and Pensions (DWP) previously administered a Child Benefit database containing details about children for whom Child Benefit was claimed, along with the claimant's particulars (usually the mother). The claimant's details were therefore repeated in as many records as she or he has children. In 2003, responsibility for Child Benefit passed to Inland Revenue (IR), now HM Revenue and Customs (HMRC).
Child Benefit counts for April 2001 and August 2002 were received from the DWP, June 2003 counts from IR and August counts for subsequent years from HMRC. Mid-year counts were not available. While data were held by DWP, IR or HMRC for each child for whom Child Benefit was claimed, because of social security and data protection legislation, DWP, IR and HMRC were unable to give us access to identifiable, subject-level data. Through an arrangement with us and DWP, the University of Oxford carried out an LSOA-level aggregation of the 2001 and 2002 datasets. While Child Benefit may be claimed for children aged 0 to 16 years and over, only counts for children in the quinary age groups of 0 to 4, 5 to 9 and 10 to 14 years were used to calculate change ratios. Eligibility for benefit decreases for children older than 15 years.
There are a number of valid reasons why we would expect the count of children aged 0 to 14 years from Child Benefit to be lower than the national mid-year estimates, including:
- dependants of students and foreign nationals, including foreign armed forces, are not eligible for Child Benefit
- children in local authority care or foster care are not eligible for Child Benefit
- children detained in secure or non-secure accommodation are not eligible for Child Benefit
- children whose entry to the UK is subject to immigration control are not eligible for Child Benefit
- children for whom Child Benefit is not claimed but who are eligible
There are also valid reasons why we would expect the distribution (location) of children from Child Benefit to differ from the mid-year estimates, including:
- school boarders - where claimant's address is different to boarder's residential address
- children who reside at a different address to the address of the claimant
UK armed forces (special population)
The armed forces component of the local authority usually resident population estimates includes UK armed forces, covering personnel in the Royal Air Force, British Army and Royal Navy stationed in England and Wales. The numbers are collected annually from the Ministry of Defence (MoD) which provides data on the number of UK armed forces by gender stationed at each base in England and Wales. These data are mainly based on administrative systems that are used
The MoD also provides an age and sex breakdown for all UK armed forces, which is used to help derive an age and sex breakdown of armed forces personnel at local authority level. The MoD are currently unable to provide counts of armed forces personnel on a usually resident basis; residence information is available from the census.
Foreign armed forces (special population)
The foreign armed forces component of the local authority usually resident population estimates covers personnel in the United States Air Force, United States Army and United States Navy. Personnel from the United States Air Force make up the great majority of foreign armed forces stationed in England and Wales.
The age and sex distribution of these personnel is not available annually and an age and sex distribution is applied using census data. The data sources for the three armed forces services differ, and the format of data received differs slightly between sources. For example, data on United States Air Force personnel are only available by location (town), while data on United States Army personnel are available at postcode level and data on United States Navy personnel are available at postcode-sector level.
Prisoners (special population)
Data are received from the Ministry of Justice on the number of prisoners for inclusion in the local authority mid-year estimates. Age and sex information for each prisoner for all prisons is received. For population estimates prior to mid-2011, prisoners are regarded as usually resident in a prison if they have been sentenced and have served six months or more of their sentence. For estimates from mid-2011 to mid-2020, prisoners are regarded as usually resident in a prison if they have been sentenced to six months or more.
For 2021 onwards, the definition has been updated to be consistent with that used in Census 2021. That is, prisoners:
- serving 12 months or more
- on convicted unsentenced remand
- with indeterminate sentences
- who are non-criminals, on recall and those with sentence length not recorded
10. Appendix D: Simplified ratio change methodology diagram for production of SOA estimates
Figure 3: Simplified ratio change methodology diagram for production of SOA estimates
Source: Office for National Statistics – Methodology note
Notes for Appendix D: Simplified ratio change methodology diagram for production of SOA estimates
Special population adjustments are as follows (affecting some age groups only): UK armed forces, foreign armed forces and dependants, and prisoners.
Ratios produced for counts containing zero by age and sex are changed to one. Example of a ratio change calculation by quinary age and sex: Year 2 count of 226 divided by Year 1 count of 197 gives a change ratio of 1.1472, which is applied to the Year 1 population for the appropriate age and sex group.
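The ratio change calculation in the note above can be reproduced with a short sketch. The counts of 197 and 226 are from the example; the function name is hypothetical, and treating a zero in either year's count as the trigger for a ratio of one is an interpretation of the note rather than a confirmed rule.

```python
def change_ratio(year1_count, year2_count):
    """Change ratio between two years' administrative counts for one
    age and sex group. A zero in either count gives a ratio of one
    (assumed interpretation of the zero-count rule)."""
    if year1_count == 0 or year2_count == 0:
        return 1.0
    return year2_count / year1_count

ratio = change_ratio(197, 226)
print(round(ratio, 4))  # 1.1472
```

The ratio is then multiplied by the Year 1 population for the same age and sex group to give the rolled-forward value before constraining.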
11. Appendix E: Sub-threshold wards
In England and Wales, there were 7,624 electoral wards as at 31 December 2023. Best fitting of output areas to the 31 December 2023 set of electoral wards for City of London and Isles of Scilly provides estimates for some wards within these areas. However, there are 16 wards smaller than OAs and they will not have separate OA estimates attached to them. In these cases, neighbouring wards have assumed the populations.
Code of sub-threshold ward not assigned population estimates | Name of sub-threshold ward not assigned population estimates |
---|---|
E05009289 | Aldgate |
E05009290 | Bassishaw |
E05009293 | Bread Street |
E05009294 | Bridge |
E05009295 | Broad Street |
E05009296 | Candlewick |
E05009298 | Cheap |
E05009299 | Coleman Street |
E05009300 | Cordwainer |
E05009301 | Cornhill |
E05009303 | Dowgate |
E05009306 | Langbourn |
E05009307 | Lime Street |
E05009311 | Vintry |
E05009312 | Walbrook |
E05011091 | St Agnes |
Table 3: Sub-threshold wards
12. Cite this methodology
Office for National Statistics (ONS), revised 25 November 2024, ONS website, methodology, Methodology note on production of population estimates by output areas, electoral, health and other geographies, England and Wales