Read the book Computational Statistics in Data Science - Group of authors - Page 19

3.1 Bayesian Sparse Regression in the Age of Big N and Big P


With the goal of identifying a small subset of relevant features among a large number of potential candidates, sparse regression techniques have long featured in a range of statistical and data science applications [46]. Traditionally, such techniques were commonly applied in the “P > N” setting, and computational algorithms correspondingly focused on this situation [47], especially within the Bayesian literature [48].

However, owing to a growing number of initiatives for large‐scale data collection and to new types of scientific inquiry made possible by emerging technologies, datasets that are “big N” and “big P” at the same time are increasingly common. For example, modern observational studies using health‐care databases routinely involve patients and clinical covariates [49]. The UK Biobank provides brain imaging data on patients, with , depending on the scientific question of interest [50]. Single‐cell RNA sequencing can generate datasets with N (the number of cells) in the millions and P (the number of genes) in the tens of thousands, and the trend points to further growth in data size [51].

