
National Statistics Methodology Advisory Committee 7th meeting 6-7 October 2004


The following work-in-progress papers were considered:

Paper 1: Estimating Variance of Movements for Time Series Data: Issues Arising for Average Earnings Index - an update

This paper updates the paper presented to the fourth NSMAC meeting. The original issues are revised and developments since September 2001 are listed, including a new method for estimating AEI variances, developments in the estimation (and imputation) of covariances over time, and progress on other NSMAC recommendations. Comparisons of AEI variance estimation methods, and of variance/covariance imputation methods, are also presented. Future work is discussed.

Committee conclusions:

  • Replacing Taylor Series linearisation was sensible, and the rationale of the proposed method was approved

  • However, the simulation study needed to be extended to other types of strata

  • Good quality measures were needed to rigorously evaluate what seemed to be very sensible numbers

  • Verification of coverage probabilities of simulated yearly change was essential. Simulated standard errors of the variance estimate were suggested
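
The coverage check requested above can be illustrated with a small simulation. The sketch below is not the AEI method: it assumes a hypothetical population of normally distributed yearly changes, simple random sampling, and a nominal 95% normal-theory interval, and it counts how often that interval contains the true mean change. All names and parameter values are illustrative assumptions.

```python
import math
import random
import statistics

def simulate_coverage(true_mean=0.04, sd=0.10, n=50, reps=2000, z=1.96, seed=1):
    """Share of simulated samples whose nominal 95% interval covers true_mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        # Draw one hypothetical sample of yearly changes.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimate = statistics.fmean(sample)
        std_error = statistics.stdev(sample) / math.sqrt(n)
        # Check whether the interval around the estimate contains the truth.
        if estimate - z * std_error <= true_mean <= estimate + z * std_error:
            covered += 1
    return covered / reps

coverage = simulate_coverage()
print(f"Empirical coverage: {coverage:.3f}")
```

An empirical coverage well below the nominal level would suggest that the variance estimator understates the true variability.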

Paper 2: Update on the methodology for the Average Earnings Ratio

This paper updates the paper presented to the fifth NSMAC meeting. The focus is on the changes in estimation methodology, comparisons of the AER growth rate with that of the AEI, and initial work on Winsorisation for extreme values.

The AER is now known as Average Weekly Earnings (AWE).

Committee conclusions:

  • AEI and AER samples/sub-samples should be compared, and the response effect simulated to assess the impact of differences in non-response

  • Using historic series, test whether IDBR errors smoothed AER weights. Smoothed weights were actually a middle ground between the AEI and the AER, but could obscure real shocks of interest. Error in weights would only be a problem if covariance existed

  • For outliers, try: linear transition of weights for companies in and out of the sample, removing all large outliers, restratifying to dampen out outlier effects

  • Differences between the AEI and AER were complex
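
Winsorisation, which the paper explores for extreme values, can be sketched in its simplest one-sided form: values above a cutoff are pulled back to the cutoff. The summary does not describe the scheme actually tested for the AER, so the cutoff, the data, and the function name below are hypothetical.

```python
def winsorise(values, cutoff):
    """One-sided Winsorisation: replace any value above the cutoff by the cutoff."""
    return [min(v, cutoff) for v in values]

# Made-up weekly pay figures with one extreme value.
weekly_pay = [310.0, 295.0, 402.0, 350.0, 5200.0]
cutoff = 1000.0  # hypothetical cutoff, not taken from the paper

treated = winsorise(weekly_pay, cutoff)
mean_before = sum(weekly_pay) / len(weekly_pay)
mean_after = sum(treated) / len(treated)
print(f"Mean before: {mean_before:.1f}, mean after: {mean_after:.1f}")
```

The trade-off is the usual one: the treated mean is more stable but biased downwards, so the choice of cutoff matters.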

Paper 3: Area random effects model-based estimation of unemployment at local authority level

Unemployment is a key indicator of the social and economic health of an area, and is used for policy making and resource allocation. There is a growing demand for unemployment estimates at the small area level, which traditional survey sources are unable to meet. In response, the ONS developed model-based methodology and used this to publish experimental estimates for 1995/96 to 1999/2000. This paper describes an enhancement to the original method, and demonstrates the improvements.

Committee conclusions:

  • The null hypothesis should be rejected, as all correlations were positive

  • Model suggestions: simpler models, a rurality variable, a spatial correlation model, smoothing over time, super regions not Government Office Regions (GORs), piecewise constants rather than polynomials, shrinkage methods rather than model selection

  • Estimate standard errors using a semi-parametric bootstrap from the estimated residuals

  • Diagnostic suggestions: Q-Q plots to check Normality, a plot of residuals vs fitted values to check for curvature, profile likelihood to assess significance
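
The idea of bootstrapping from estimated residuals can be sketched as follows. This is a deliberately simplified illustration: a single-level straight-line regression on synthetic data stands in for the area random effects model, and the function names are hypothetical. A fuller semi-parametric bootstrap for a multilevel model would also resample the estimated area-level effects.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_bootstrap_se(x, y, reps=1000, seed=1):
    """Bootstrap standard error of the slope, resampling the fitted residuals."""
    rng = random.Random(seed)
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(reps):
        # Rebuild a pseudo-response from fitted values plus resampled residuals.
        y_star = [fi + rng.choice(residuals) for fi in fitted]
        slopes.append(fit_line(x, y_star)[1])
    return statistics.stdev(slopes)

# Synthetic data: true intercept 2.0, true slope 0.5, unit-variance noise.
data_rng = random.Random(0)
x = [float(i) for i in range(30)]
y = [2.0 + 0.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]

slope = fit_line(x, y)[1]
se = residual_bootstrap_se(x, y)
print(f"Slope: {slope:.3f}, bootstrap SE: {se:.4f}")
```

Resampling the residuals rather than assuming a Normal error distribution is what makes the approach semi-parametric: the mean structure comes from the model, the error distribution from the data.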

Paper 4: Modelling Households on Low Income by Ward

Following the decision not to include an income question in the 2001 Census, the Office for National Statistics (ONS) ran a project to produce model-based estimates of average weekly household income for all wards in England and Wales. These estimates were published as experimental statistics on the Neighbourhood Statistics website. One disadvantage of these estimates for users is that summary statistics can hide extreme wealth and poverty. This paper reports on investigations into modelling other measures of income, such as the median, distributions of income, or even proportions of households above/below set levels.

Committee conclusions:

  • Care was needed modelling when areas have few households

  • Model suggestions: a cross-classified model with three levels, random coefficient models to discern whether variation between postcode sectors was a function of ward characteristics, dynamic modelling (or an ordered logit model as an approximation) rather than a binary threshold for poverty, an MCMC algorithm (or quasi-likelihood function) to account for monotonicity in the distribution function

  • Estimation suggestions: a first order Taylor series rather than the anti-logit, bootstrapping or MCMC for confidence intervals, simulation of the random effects distribution to estimate the average
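
The last estimation suggestion, simulating the random effects distribution to estimate the average, can be illustrated with a small sketch. It assumes a hypothetical logistic model in which the probability of low income in an area is the anti-logit of a linear predictor plus a Normal random effect. The values of `eta` and `sigma` are made up, and the paper's actual model is not described in this summary.

```python
import math
import random

def expit(z):
    """Anti-logit (inverse logit) function."""
    return 1.0 / (1.0 + math.exp(-z))

def average_probability(eta, sigma, draws=20000, seed=1):
    """Average expit(eta + u) over simulated random effects u ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    total = sum(expit(eta + rng.gauss(0.0, sigma)) for _ in range(draws))
    return total / draws

eta = -2.0    # hypothetical linear predictor on the logit scale
sigma = 1.0   # hypothetical random effect standard deviation

plug_in = expit(eta)                       # anti-logit with the random effect set to zero
simulated = average_probability(eta, sigma)
print(f"Plug-in: {plug_in:.3f}, simulated average: {simulated:.3f}")
```

Because the anti-logit is non-linear, the plug-in value and the simulated average differ, which is why the choice between them matters for small probabilities.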

Paper 5: Model-Based Estimation of Income: Measuring Change Over Time

After producing model-based estimates of income for wards in England and Wales for 1998/99, ONS recently repeated these estimates (using more up-to-date sources of data) for 2001/02. Before these new estimates can be published, however, the comparability of estimates over time needs to be investigated. This paper discusses current guidance and the more general issue of estimating change over time.

Committee conclusions:

  • Try adding random effects (spatial and temporal) to the current International Labour Organisation (ILO) unemployment model

  • A household panel survey may be better for modelling income

  • Change over time suggestions: for a stable model include (or impute) all covariates ever significant, the repeated measures model requires overlapping wards, build up change from individual level data/modelling and calibrate to levels

Paper 6: NS 2001 Area Classifications

This paper updates a paper presented at the fifth NSMAC meeting. The paper describes what has happened to the released estimates since the original paper, and introduces a demonstration of Scalable Vector Graphics given at the meeting.

Committee conclusions:

  • An option to have scales on the bar charts was suggested

Paper 7: The 2001 Local Authority Population Studies and 2001 Manchester and Westminster Matching Studies

This paper reviews investigations into the performance of the 2001 Census, including the two high profile studies in Manchester and Westminster. These studies resulted in revisions being published for 15 local authorities (including Manchester and Westminster), totalling around 107,000 people. Emerging issues for the 2011 Census are also discussed.

Committee conclusions:

  • The NSMAC considered the methodology an improvement on 1991

  • Suggestions: an ONS address list as the long-term solution to possible overcount bias in council lists, allocating interviewers across areas to allow estimation of interviewer effects

  • Community ownership was recommended to get data/input from users before the census

Paper 8: A Statistical Framework for the 2011 Census

This paper identifies the statistical issues emerging from the 2001 Census and outlines plans for their resolution. The focus is on the key statistical aims and principles, and high priority areas of research - to frame statistical research and development over the coming years.

Committee conclusions:

  • Plans were thought to be on the right track, and needed to build on past evaluations/lessons. Cost/benefit analysis should be their driver

  • Assumptions that needed justifying: postal data collection would work in inaccessible areas, earlier processing was better than a fixed deadline, better questionnaires lead to less underenumeration, accurate administrative data will be available on time, census and social survey non-response is the same

  • Research should focus on: a gold standard address list, getting the count right, making mixed enumeration work, post-enumeration leading to robust estimates of enumeration errors at small area level, optimal targeting of the Census Coverage Survey (or sampling from administrative files) to minimise non-response and differential non-response, testing changes from 2001, better partnerships with LADs, different enumeration and collection strategies, using 1971 to 2001 data to study non-response

  • Parallel running of 2001 and 2011 Census methodology was recommended to measure the success of changes

  • More effort in reducing enumeration errors was urged. Other suggestions: improve questionnaire handling, get rough estimates out quicker, ensure sign-on from ethnic minorities, compare numbers returned to numbers despatched

Paper 9: Disclosure Control for Small Area Data

This paper examines the scope for a pre-tabular method of disclosure control for small area data, with particular reference to the 2011 Census in Scotland.

Committee conclusions:

  • This issue was particularly important for the Census

  • Detection of protected records is possible if included in too many related tables, and disclosure methods need to take account of increases in computer power

  • For the 2011 Census, the disclosure method should be supported by software and used in all three UK Censuses, with balanced utility and risk, possibly limiting damage to data by changing both numerators and denominators

  • The 'PRAM' method was difficult to use, but was less damaging at small areas than geographic record swapping. Combining the two, to user requirements, was complex but required less perturbation.
    The two options for release were to analyse data in a safe setting, or perturb (taking account of information loss) and analyse remotely

  • The Census departments were invited to consider the committee's suggestions in future disclosure control for the 2011 Census
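
For readers unfamiliar with PRAM (the post-randomisation method), the sketch below shows the basic mechanism: before tabulation, each record's category is released unchanged or replaced by another category according to a fixed table of transition probabilities. The categories, probabilities, and function name are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical transition probabilities: each row gives the chance that a
# recorded category is released as each possible category (rows sum to 1).
TRANSITION = {
    "employed":   {"employed": 0.90, "unemployed": 0.05, "inactive": 0.05},
    "unemployed": {"employed": 0.05, "unemployed": 0.90, "inactive": 0.05},
    "inactive":   {"employed": 0.05, "unemployed": 0.05, "inactive": 0.90},
}

def pram(values, transition, seed=1):
    """Perturb each categorical value using its row of transition probabilities."""
    rng = random.Random(seed)
    released = []
    for value in values:
        row = transition[value]
        released.append(rng.choices(list(row), weights=list(row.values()), k=1)[0])
    return released

# Made-up microdata: 1000 records drawn uniformly from the three categories.
data_rng = random.Random(0)
records = [data_rng.choice(list(TRANSITION)) for _ in range(1000)]

perturbed = pram(records, TRANSITION)
share_changed = sum(a != b for a, b in zip(records, perturbed)) / len(records)
print(f"Share of records changed: {share_changed:.3f}")
```

The diagonal of the transition table controls the balance between risk and utility: lower retention probabilities give more protection but more damage to the data.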

Content from the Office for National Statistics.
© Crown Copyright applies unless otherwise stated.