Walters, SJ and Brazier, JE (2002): Sample sizes for the SF-6D preference based measure of health from the SF-36: a practical guide.
Background Health-Related Quality of Life (HRQoL) measures are increasingly used in clinical trials and health services research, as both primary and secondary endpoints. Investigators are now asking statisticians for advice on how to plan and analyse studies using HRQoL measures, including questions on sample size. Sample size requirements depend critically on the aims of the study, the outcome measure and its summary measure, the effect size and the method of calculating the test statistic. The SF-6D is a new single summary preference-based measure of health derived from the SF-36, suitable for use in clinical trials and in the economic evaluation of health technologies.
Objectives To describe and compare two methods of calculating sample sizes when using the SF-6D in comparative clinical trials and to give pragmatic guidance to researchers on which method to use.
Methods We describe two main methods of sample size estimation. The parametric (t-test) method assumes the SF-6D data are continuous and normally distributed and that the effect size is the difference between two means. The non-parametric (Mann-Whitney, MW) method assumes the data are continuous and not normally distributed, and defines the effect size as the probability that an observation drawn at random from population Y would exceed an observation drawn at random from population X. We used bootstrap computer simulation to compare the power of the two methods for detecting a shift in location.
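The two approaches can be illustrated with a minimal sketch. It assumes the standard normal-approximation formula for the two-sample t-test and Noether's formula for the MW test, with equal group sizes and a two-sided test; the function names and default values are illustrative and are not taken from the paper.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _z_sum(alpha, power):
    """z_{1-alpha/2} + z_{power} for a two-sided test."""
    return _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)


def n_per_group_ttest(d, alpha=0.05, power=0.80):
    """Parametric method: d is the standardised difference in means
    (mean difference divided by the common standard deviation)."""
    return ceil(2 * _z_sum(alpha, power) ** 2 / d ** 2)


def n_per_group_mw(p, alpha=0.05, power=0.80):
    """Non-parametric method (Noether's formula, equal allocation):
    p is the probability that a random Y exceeds a random X."""
    return ceil(_z_sum(alpha, power) ** 2 / (6 * (p - 0.5) ** 2))


def p_from_d(d):
    """Link between the two effect sizes when both groups are Normal
    with a common variance: p = Phi(d / sqrt(2))."""
    return _STD_NORMAL.cdf(d / sqrt(2))


print(n_per_group_ttest(0.5))          # 63 per group
print(n_per_group_mw(p_from_d(0.5)))   # slightly larger than the t-test figure
```

Under normality the MW calculation asks for only a few more subjects than the t-test calculation, which reflects the high relative efficiency of the rank test in that setting.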
Results We describe the SF-6D and retrospectively calculate parametric and non-parametric effect sizes for the SF-6D from a variety of studies that had previously used the SF-36. Computer simulation suggested that if the distribution of the SF-6D is reasonably symmetric then the t-test is more powerful than the MW test at detecting differences in means. Therefore, if the distribution of the SF-6D is symmetric, or expected to be reasonably symmetric, parametric methods should be used for sample size calculations and analysis. If the distribution of the SF-6D is skewed then the MW test appears to be more powerful than the t-test at detecting a location shift (difference in means). However, the differences in power between the t and MW tests are small and decrease as the sample size increases.
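A power comparison of this kind can be sketched as follows. This is not the authors' simulation code: the pilot scores are hypothetical (drawn from a skewed Beta distribution as a stand-in for SF-6D data), both tests use large-sample normal critical values, and shifted scores are not truncated to the valid range of the instrument.

```python
import random
from collections import Counter
from statistics import NormalDist, mean, variance

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% significance level


def t_rejects(x, y):
    """Two-sample test on means (unpooled statistic, normal critical value)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return abs(mean(y) - mean(x)) / se > Z_CRIT


def mw_rejects(x, y):
    """Mann-Whitney test: U statistic with tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = sum((b > a) + 0.5 * (b == a) for a in x for b in y)
    ties = sum(t ** 3 - t for t in Counter(x + y).values())
    var_u = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    return abs(u - n1 * n2 / 2) / var_u ** 0.5 > Z_CRIT


def bootstrap_power(pilot, shift, n_per_group, n_sims=500, seed=1):
    """Resample two groups from the pilot data, add `shift` to one of them,
    and return the proportion of rejections for (t-test, MW test)."""
    rng = random.Random(seed)
    hits_t = hits_mw = 0
    for _ in range(n_sims):
        x = rng.choices(pilot, k=n_per_group)
        y = [v + shift for v in rng.choices(pilot, k=n_per_group)]
        hits_t += t_rejects(x, y)
        hits_mw += mw_rejects(x, y)
    return hits_t / n_sims, hits_mw / n_sims


# Hypothetical left-skewed pilot sample on (0, 1); not real SF-6D data.
_gen = random.Random(0)
pilot = [_gen.betavariate(5, 2) for _ in range(200)]

power_t, power_mw = bootstrap_power(pilot, shift=0.1, n_per_group=50)
print(f"t-test power: {power_t:.2f}, MW power: {power_mw:.2f}")
```

Setting `shift=0` gives an estimate of each test's type I error rate, which is a useful check before comparing power.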
Conclusions We have provided a clear description of the distribution of the SF-6D and believe that the mean is an appropriate summary measure for the SF-6D when it is to be used in clinical trials and the economic evaluation of new health technologies. Pragmatically, therefore, we recommend that parametric methods be used for sample size calculation and analysis when using the SF-6D.
|Item Type:||MPRA Paper|
|Original Title:||Sample sizes for the SF-6D preference based measure of health from the SF-36: a practical guide|
|Keywords:||sample size; health-related quality of life; SF-36; preference-based measures of health; bootstrap simulation|
|Subjects:||I - Health, Education, and Welfare > I3 - Welfare and Poverty > I31 - General Welfare; I - Health, Education, and Welfare > I1 - Health > I19 - Other|
|Depositing User:||Sarah McEvoy|
|Date Deposited:||24. Mar 2011 21:50|
|Last Modified:||12. Feb 2013 11:07|