Emura, Takeshi and Chen, Yi-Hau and Chen, Hsuan-Yu (2012): Survival prediction based on compound covariate under Cox proportional hazard models. Published in: PLoS ONE, Vol. 7, No. 10 (2012): e47627
Abstract
Survival prediction from a large number of covariates is a current focus of statistical and medical research. In this paper, we study a methodology known as compound covariate prediction, performed under univariate Cox proportional hazard models. We demonstrate via simulations and real data analysis that the compound covariate method generally competes well with ridge regression and Lasso methods, both already well-studied methods for predicting survival outcomes with a large number of covariates. Furthermore, we develop a refinement of the compound covariate method by incorporating likelihood information from multivariate Cox models. The new proposal is an adaptive method that borrows information contained in both the univariate and multivariate Cox regression estimators. We show that the new proposal is theoretically justified by statistical large-sample theory and is naturally interpreted as a shrinkage-type estimator, a popular class of estimators in the statistical literature. Two datasets, the primary biliary cirrhosis of the liver data and the non-small-cell lung cancer data, are used for illustration. The proposed method is implemented in the R package “compound.Cox”, available on CRAN at http://cran.r-project.org/.
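The basic compound covariate idea described in the abstract can be illustrated with a minimal sketch: fit one univariate Cox model per covariate, then combine the covariates into a single risk score weighted by the univariate coefficients. The code below is an assumption-laden illustration in plain Python, not the “compound.Cox” implementation. The function names `fit_univariate_cox` and `compound_covariate`, the Breslow-style handling of risk sets, the capped Newton step, and the toy data are all hypothetical choices made for this example. The refined shrinkage-type estimator that also uses multivariate likelihood information is not shown.

```python
import math

def fit_univariate_cox(times, events, x, n_iter=25, tol=1e-8):
    """Estimate beta in a one-covariate Cox model by Newton-Raphson
    on the partial log-likelihood (Breslow-style risk sets)."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for i, (t_i, d_i) in enumerate(zip(times, events)):
            if not d_i:
                continue  # censored subjects contribute only through risk sets
            # Risk set: subjects still under observation at time t_i.
            risk = [x[j] for j, t_j in enumerate(times) if t_j >= t_i]
            w = [math.exp(beta * xj) for xj in risk]
            s0 = sum(w)
            s1 = sum(wj * xj for wj, xj in zip(w, risk))
            s2 = sum(wj * xj * xj for wj, xj in zip(w, risk))
            mean = s1 / s0
            score += x[i] - mean           # first derivative
            info += s2 / s0 - mean * mean  # minus the second derivative
        if info <= 0.0:
            break
        # Crude safeguard (an assumption of this sketch): cap the step size.
        step = max(-1.0, min(1.0, score / info))
        beta += step
        if abs(step) < tol:
            break
    return beta

def compound_covariate(times, events, X):
    """Return the univariate coefficients and the compound covariate
    (sum over j of beta_j * x_ij) for every subject."""
    p = len(X[0])
    betas = [fit_univariate_cox(times, events, [row[j] for row in X])
             for j in range(p)]
    scores = [sum(b * v for b, v in zip(betas, row)) for row in X]
    return betas, scores

# Toy data, invented for illustration: 8 subjects, 2 covariates.
times = [5, 8, 12, 3, 9, 15, 7, 11]
events = [1, 1, 0, 1, 1, 0, 1, 1]  # 1 = event observed, 0 = censored
X = [[1.2, 0.5], [0.3, -0.3], [-0.5, 0.2], [0.8, -0.8],
     [0.9, 0.4], [-1.0, 0.1], [-0.2, -0.6], [0.1, 0.9]]

betas, scores = compound_covariate(times, events, X)
print("univariate coefficients:", betas)
print("compound covariate per subject:", scores)
```

A higher compound covariate indicates a higher predicted hazard, so subjects can be ranked or split into risk groups by this score. Because each coefficient comes from a separate one-covariate fit, the approach remains usable when the number of covariates exceeds the number of subjects, where a joint multivariate fit would not be.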
Item Type: | MPRA Paper |
---|---|
Original Title: | Survival prediction based on compound covariate under cox proportional hazard models |
English Title: | Survival Prediction Based on Compound Covariate under Cox Proportional Hazard Models |
Language: | English |
Keywords: | Cox proportional hazard model, Prediction, Survival analysis |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C13 - Estimation: General C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C14 - Semiparametric and Nonparametric Methods: General C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C34 - Truncated and Censored Models ; Switching Regression Models C - Mathematical and Quantitative Methods > C2 - Single Equation Models ; Single Variables > C24 - Truncated and Censored Models ; Switching Regression Models ; Threshold Regression Models C - Mathematical and Quantitative Methods > C4 - Econometric and Statistical Methods: Special Topics |
Item ID: | 41149 |
Depositing User: | Takeshi Emura |
Date Deposited: | 14 Jan 2013 18:24 |
Last Modified: | 27 Sep 2019 14:13 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/41149 |