National Statistics Methodology Advisory Committee 8th meeting 4 May 2005

Eighth meeting, 4 May 2005

The following work-in-progress papers were considered:

Paper 1: Student Income and Expenditure Survey (SIES): Progress Report

This paper updates one presented to the fifth NSMAC meeting. The SIES collects important information on the income and expenditure of higher education students in England and Wales. The paper details the sampling methodology (essentially two-stage sampling: higher/further education institutions first, then higher education students within them) and reports progress on the 2004/5 survey.
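
As an illustration of the two-stage design described above, the sketch below draws institutions first and then students within each selected institution. It is a minimal sketch and not the SIES procedure: the frame, the identifiers, the sample sizes and the use of simple random sampling without replacement at both stages are all assumptions made for the example.

```python
import random


def two_stage_sample(frame, n_institutions, n_students, seed=0):
    """Draw institutions, then students within each drawn institution.

    frame maps a (hypothetical) institution id to its list of student ids.
    Both stages use simple random sampling without replacement, which is
    an assumption made for illustration only.
    Returns a list of (institution, student, design_weight) tuples.
    """
    rng = random.Random(seed)
    # Stage 1: sample institutions from the frame.
    institutions = rng.sample(sorted(frame), n_institutions)
    sample = []
    for inst in institutions:
        students = frame[inst]
        m = min(n_students, len(students))
        # Design weight = inverse of the overall inclusion probability.
        weight = (len(frame) / n_institutions) * (len(students) / m)
        # Stage 2: sample students within the selected institution.
        for student in rng.sample(students, m):
            sample.append((inst, student, weight))
    return sample


# Hypothetical frame: four institutions of different sizes.
frame = {
    "inst_a": [f"a{i}" for i in range(40)],
    "inst_b": [f"b{i}" for i in range(25)],
    "inst_c": [f"c{i}" for i in range(60)],
    "inst_d": [f"d{i}" for i in range(10)],
}
sample = two_stage_sample(frame, n_institutions=2, n_students=5, seed=1)
```

Summing the design weights over the sample gives an estimate of the total number of students on the frame, which is one simple way to check that the weights are coherent.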

Committee conclusions:

  • The issues raised when the paper had first been presented to the NSMAC had been addressed well

  • Although requested by the institutions themselves, creating the sampling frame at the start of the academic year meant student numbers would not yet have settled (because of early dropouts)

  • Modelling or calibration was suggested to deal with possible bias from two sources: the sampling frame excluding small institutions, and differential non-response related to institution size

  • The NSMAC suggested sending both guidance and a guide to help the institutions select members from their list

Paper 2: Estimation of Disclosure Risk in Sample Microdata Using Probabilistic Modelling

The risk of re-identification of individuals/businesses is a key factor in deciding whether microdata can be released. Quantifying this risk when the population base is unknown (or partially unknown) is a complex problem. This paper presents the early results of a research programme aiming to solve this problem through probabilistic modelling. Associated issues, such as robust goodness of fit criteria and complex survey designs, are also discussed.

Committee conclusions:

  • In the simulations, log-linear models with two-way interactions seemed to work best with the Poisson Risk Model and the more general Negative Binomial Risk Model (however, there are estimation problems owing to large contingency tables when implementing log-linear modelling)

  • The number/proportion of population uniques should also be considered as risk measures

  • Disclosure risk assessment is an iterative process balancing between the application of SDC methods and managing the disclosure risk to tolerable levels

  • Confidence intervals were needed for the estimates of the risk measures

  • Reducing the size of the key (through recoding) would solve the partitioning problem, and the 100 age variables could be reduced by collapsing over invariant age ranges (for example, 80 to 90) or by using a smooth functional form for age

  • Risk thresholds for release of microdata should be determined by assessing and benchmarking to previous cases where microdata has been released
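
The risk measures discussed in these conclusions can be illustrated with a small sketch. It assumes a Poisson model in which the population count F_k of each key cell has mean lambda_k, and units are sampled independently with a common sampling fraction, so that for a sample-unique cell F_k is 1 plus a Poisson count with mean lambda_k × (1 − sampling fraction). This is a commonly used formulation in the disclosure-risk literature and may differ from the models in the paper; the function name and inputs are hypothetical.

```python
import math


def poisson_risk_measures(lambdas, sampling_fraction):
    """Return (tau1, tau2) summed over sample-unique cells.

    lambdas: assumed fitted Poisson means of the population counts F_k for
             each sample-unique cell (for example from a log-linear model).
    tau1: expected number of sample uniques that are population unique.
    tau2: sum of E(1 / F_k | f_k = 1), the expected number of correct
          matches if each sample unique is matched at random within its cell.
    """
    tau1 = 0.0
    tau2 = 0.0
    for lam in lambdas:
        # Mean of the unsampled part of the cell.
        mu = lam * (1.0 - sampling_fraction)
        p_unique = math.exp(-mu)  # P(F_k = 1 | f_k = 1)
        tau1 += p_unique
        # E[1 / (1 + X)] for X ~ Poisson(mu) is (1 - exp(-mu)) / mu.
        tau2 += 1.0 if mu == 0 else (1.0 - p_unique) / mu
    return tau1, tau2
```

Dividing tau1 by the number of sample uniques gives the corresponding proportion. Interval estimates, as requested by the committee, would need additional work that this sketch does not attempt.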

Paper 3: Using time series methods to produce early LFS estimates

The Labour Force Survey produces key UK labour market statistics which influence local and central government economic policy. The time lag between the end of data collection and publication of estimates is currently six weeks. This paper presents a forecasting methodology that would enable preliminary estimates (broken down by sex and employment/unemployment/inactivity) to be published four weeks earlier. The performance of the methodology is assessed using historical data.

Committee conclusions:

  • This paper on forecasting was welcomed by the NSMAC

  • Some improvements to the diagnostics used in the modelling process were suggested

  • Joint modelling was also suggested as, when viewed as a series of cross-equation restrictions, it was simpler than constraining to population, and it could pick up important divergences missed in current results

  • It was noted that weights needed to be forecast as well, and it was suggested that we could try doing this by forecasting a time series constructed from global linear weights
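
For readers unfamiliar with what "constraining to population" involves, the sketch below shows one simple version: separately forecast categories are scaled pro rata so that they sum to a known population total. The pro-rata rule, the function name and the figures are assumptions for illustration; the minutes do not say which constraining method the paper used.

```python
def constrain_to_population(forecasts, population_total):
    """Scale category forecasts pro rata so they sum to population_total.

    forecasts maps a category label to its unconstrained forecast.
    Pro-rata scaling is an assumed rule, chosen here for simplicity.
    """
    total = sum(forecasts.values())
    if total <= 0:
        raise ValueError("forecasts must sum to a positive value")
    factor = population_total / total
    return {category: value * factor for category, value in forecasts.items()}


# Placeholder figures, not LFS results.
unconstrained = {"employed": 60.0, "unemployed": 5.0, "inactive": 30.0}
constrained = constrain_to_population(unconstrained, population_total=100.0)
```

Pro-rata scaling preserves the ratios between categories, so it cannot reveal a divergence between series; that is one reason joint modelling might pick up features this kind of adjustment misses.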

Paper 4: Methodological Developments to Support Allsopp's Objectives for Economic Statistics

In 2004, Christopher Allsopp (of the Monetary Policy Committee) published his 'Review of Statistics for Economic Policymaking'. The challenges from the review for the Office for National Statistics (ONS) are to produce better economic estimates at regional level and to reflect the changing structure of the economy more effectively. This paper outlines how these challenges will be met by re-engineering the Inter-Departmental Business Register (IDBR) and redesigning and integrating some main business surveys.

Committee conclusions:

  • ONS should be prepared to make some outputs worse (not everything needs to get better), but based on a thorough understanding of user requirements

  • Bias/variance trade-off is key to this paper, but robustness of estimates to IDBR error, and IDBR coverage error, also need to be considered

  • The NSMAC would like to see the performance of multivariate allocation, but noted that calibrating too far merely adds noise

  • A better integrated business survey system was in demand, and could be used for microsimulation of business performance (and commercialised as at Statistics Canada)

  • The choice of updating source should minimise MSE. Updating frequency depends on its effects and on rotation (annual updates bias estimates of change; real-time updates bias levels). Modelling could help update non-surveyed IDBR data (or even all of the IDBR), depending on the MSE impact. To identify the effect of updating, run key outputs in parallel (with different treatments of births and deaths)

  • If deaths are high when births are low, then the current treatment introduces bias. Two solutions are: use the ONS birth adjustment method, or use the state of the economy to model unrecorded births

  • The choice between design-based or domain-based estimation depends on costs/benefits and user requirements

  • Both local level raw data and small area estimation should be used (although the latter was difficult, possibly disclosive for concentrated businesses, and should be accompanied by quality indicators)

  • Administrative data should supplement survey data, not replace them. This should be made clear to users

  • Systems should be built to take advantage of future data linking opportunities, and meet changing user needs

  • IDBR disclosure risks at low levels could be dealt with by local user QA under closed conditions
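
The MSE criterion mentioned in these conclusions combines the two sides of the bias/variance trade-off: mean squared error equals squared bias plus variance. A minimal sketch of choosing between candidate updating sources on that basis follows; the source labels and the bias and variance figures are arbitrary placeholders, not ONS results.

```python
def mse(bias, variance):
    """Mean squared error of an estimator: squared bias plus variance."""
    return bias ** 2 + variance


def pick_min_mse(options):
    """Return the label with the smallest MSE.

    options maps a label to a (bias, variance) pair.
    """
    return min(options, key=lambda label: mse(*options[label]))


# Arbitrary placeholder values for two hypothetical updating sources.
options = {"source_a": (2.0, 1.0), "source_b": (0.5, 4.0)}
best = pick_min_mse(options)
```

In practice the bias and variance of each option would themselves have to be estimated, for example from the parallel runs the committee suggested.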

Content from the Office for National Statistics.
© Crown Copyright applies unless otherwise stated.