
Contents

Chapter 2: Causal Inference and Comparative Effectiveness: A Foundation

2.1 Introduction

2.2 Causation

2.3 From R.A. Fisher to Modern Causal Inference Analyses

2.3.1 Fisher’s Randomized Experiment

2.3.2 Neyman’s Potential Outcome Notation

2.3.3 Rubin’s Causal Model

2.3.4 Pearl’s Causal Model

2.4 Estimands

2.5 Totality of Evidence: Replication, Exploratory, and Sensitivity Analyses

2.6 Summary

References

2.1 Introduction

In this chapter, we introduce the basic concept of causation and the history and development of causal inference methods, including two popular causal frameworks: Rubin’s Causal Model (RCM) and Pearl’s Causal Model (PCM). This includes the core assumptions necessary for standard causal inference analyses, a discussion of estimands, and directed acyclic graphs (DAGs). Lastly, we discuss the strength of evidence needed to justify inferring a causal relationship between an intervention and an outcome of interest in non-randomized studies. The goal of this chapter is to provide the theoretical background behind the causal inference methods that are discussed and implemented in later chapters. Unlike the rest of the book, this chapter is largely theoretical: the few code and formula sketches included here are purely illustrative, and the specific analytical methods are deferred to later chapters. Reading this chapter is not necessary if your main interest is the application of the methods for inferring causation.

2.2 Causation

In health care research, it is often of interest to identify whether an intervention is “causally” related to a set of outcomes. For example, in a comparative effectiveness study, the objective is to assess whether a particular drug intervention is efficacious (for example, better disease control, improved patient satisfaction, superior tolerability, lower health care resource use or medical cost) for the target patient population in real world settings. Before defining causation, let us first point out the difference between causation and association (or correlation). For example, we have observed global warming for the past decade, and during the same period the GDP of the United States increased an average of 2% per year. Are we able to claim that global warming is the cause of the US GDP increase, or vice versa? Not necessarily. The observation just indicates that global warming was present while the US GDP was increasing. Therefore, “global warming” and “US GDP increase” are two correlated or associated events, but there is little or no evidence suggesting a direct causal relationship between them.

The discussion regarding the definition of “causation” has been ongoing for centuries among philosophers. We borrow the ideas of the 18th century Scottish philosopher David Hume to define causation: causation is the relation that holds between two temporally simultaneous or successive events when the first event (the cause) brings about the other (the effect). According to Hume, when we say that “A causes B” (for example, fire causes smoke), we mean that

● A is “constantly conjoined” with B;

● B follows A and not vice versa;

● there is a “necessary connection” between A and B such that whenever an A occurs, a B must follow.

Here we present a hypothetical example to illustrate a “causal effect.” Assume that a subject has a choice to take drug A (T=1) or not (T=0), and the outcome of interest Y is a binary variable (1 = better, 0 = not better). There are four possible scenarios that we could observe. (See Table 2.1.)

Table 2.1: Possible Causal Effect Scenarios

1. The subject took A and got better.                  T=1, Y=1 (actual outcome)
2. The subject took A and did not get better.          T=1, Y=0 (actual outcome)
3. The subject did not take A and got better.          T=0, Y=1 (actual outcome)
4. The subject did not take A and did not get better.  T=0, Y=0 (actual outcome)

If we observe any one of the scenarios in Table 2.1, can we claim a causal effect of drug A on outcome Y? That is, will taking the treatment make the subject better or not? The answer is “probably not,” even if we observe scenario 1, where the subject did get better after taking drug A. Why? The subject might have gotten better without taking drug A. Therefore, at the individual level, a causal relationship between the intervention (taking drug A) and the outcome cannot be established, because we cannot observe the “counterfactual” outcome had the patient not taken that action.

If we were somehow able to know both the actual outcome of an intervention and the counterfactual outcome, that is, the outcome of the opposite, unobserved intervention (though in fact we are never able to observe the counterfactual outcome), then we could assess whether a causal effect exists between A and Y. Table 2.2 returns to the four possible scenarios in Table 2.1, but now with knowledge of both the outcome and the “counterfactual” outcome.

Table 2.2: Possible Causal Effect Scenarios with Counterfactual Outcomes

Unfortunately, in reality we are not able to observe both the outcome and its “counterfactual” simultaneously while keeping all other features of the subject unchanged. That is, we are not able to observe the “counterfactual” outcome for the same subject. This presents a critical challenge for assessing causal effects in research where causation is of interest. In summary, we might have to admit that understanding the causal relationship at the individual subject level is not attainable. Two approaches to address this issue are provided in Sections 2.3.3 and 2.3.4.

2.3 From R.A. Fisher to Modern Causal Inference Analyses

2.3.1 Fisher’s Randomized Experiment

For a long period of time, statisticians, even great pioneers like Francis Galton and Karl Pearson, tended not to talk about causation but rather about association or correlation (for example, Pearson’s correlation coefficient). Regression modeling was used as a tool to assess the association between a set of variables and the outcome of interest. The estimated regression coefficients were sometimes interpreted as causal effects (Yule, 1895, 1897, 1899), though such an interpretation could be misleading (Imbens and Rubin, 2015). This confusion persisted until Sir Ronald Fisher brought clarity through the idea of the randomized experiment.

Fisher wrote a series of papers and books on randomized experiments in the 1920s and 1930s (Fisher, 1922, 1930, 1936a, 1936b, 1937). Fisher stated that, when comparing treatment effects between treatment and control groups, randomization could remove the systematic distortions that bias causal treatment effect estimates. Note that these “systematic distortions” could be either measured or unmeasured. With perfect randomization, the control group provides the counterfactual outcomes for the observed performance in the treatment group, so that the causal effect can be estimated. Thus, with randomization, a causal interpretation of the relationship between the treatment and the outcome is possible. Because of its ability to evaluate the causal treatment effect in a less biased manner, the concept of the randomized experiment was gradually accepted by researchers and regulators worldwide. Double-blinded, randomized clinical trials have become and remain the gold standard for seeking approval of a human pharmaceutical product.
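To make this concrete, the following minimal SAS simulation sketch (hypothetical data and parameters, not an example from later chapters) generates both potential outcomes for every subject and then reveals only one of them through a randomized assignment. Because randomization makes the two groups exchangeable, the simple difference in observed group means recovers the true average causal effect even though no individual causal effect is ever observed.

data sim;
  call streaminit(20200501);          /* arbitrary seed for reproducibility */
  do i = 1 to 100000;
    y0 = rand('bernoulli', 0.30);     /* potential outcome without drug A */
    y1 = rand('bernoulli', 0.50);     /* potential outcome with drug A */
    t  = rand('bernoulli', 0.50);     /* randomized treatment assignment */
    y  = t*y1 + (1 - t)*y0;           /* only this outcome is ever observed */
    output;
  end;
run;

/* The difference in mean(y) between t=1 and t=0 approximates the
   true average causal effect of 0.50 - 0.30 = 0.20 */
proc means data=sim mean;
  class t;
  var y;
run;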

Randomized controlled trials (RCTs) remain at the top of the hierarchy of evidence largely because of their ability to generate causal interpretations for treatment effects. However, RCTs also have limitations:

1. It is not always possible to conduct an RCT due to ethical or practical constraints.

2. They have great internal validity but often lack external validity (generalizability).

3. They are often not designed with sufficient power to study heterogeneous causal treatment effects (subgroup identification).

With the growing availability of large real world health care data, there is growing interest in non-randomized observational studies for assessing the real world causal effects of interventions. Without randomization, proper assessment of causal effects is difficult. For example, in routine clinical practice, a group of patients receiving treatment A might be younger and healthier than another group of patients receiving treatment B, even if A and B have the same target population and indication. Therefore, a direct comparison of outcomes between those two groups of patients could be biased because of the imbalances in important patient characteristics between the two groups. Variables that influence both the treatment choice and the outcome are confounders, and their existence presents an important methodological challenge for estimating causal effects in non-randomized studies. So, what can one do? Fisher himself did not give an answer, but the idea of inferring causation through randomized experiments influenced the field of statistics and eventually led to well-accepted causal frameworks for inferring causation from non-randomized studies, for example, the framework developed by Rubin and the framework developed by Pearl and Robins.

2.3.2 Neyman’s Potential Outcome Notation

Before formally introducing a causal framework, it is necessary to briefly review the notation of “potential outcomes.” Potential outcomes were first proposed by Neyman (1923) to explain causal effects in randomized experiments, but they were not used elsewhere for decades until other statisticians realized their value for inferring causation in non-randomized studies.

Neyman’s notation begins as follows. Assume T=0 and T=1 are the two interventions or treatments for comparison, and Y is the outcome of interest. Every subject in the study has two potential outcomes: Y(1) and Y(0). That is, the two potential outcomes are the outcome had the subject taken treatment 1 and the outcome had the subject taken treatment 0. Therefore, for subjects i = 1, …, n, there exists a vector of potential outcomes for each of the two different treatments, (Y_1(1), …, Y_n(1)) and (Y_1(0), …, Y_n(0)). Given this notation, the causal effect is defined as the difference in a statistic (mean difference, odds ratio, and so on) between the two potential outcome vectors. In the following sections, we introduce two established causal frameworks that have been commonly used in health care research: Rubin’s Causal Model and Pearl’s Causal Model.

2.3.3 Rubin’s Causal Model

Rubin’s Causal Model (RCM) was named by Holland (Holland, 1986) in recognition of the seminal work in this area conducted by Donald Rubin in the 1970s and early 1980s (Rubin 1974, 1977, 1978; Rosenbaum and Rubin 1983). Below we provide a brief description of the RCM; readers who are interested in learning more can consult the numerous papers and books written on this framework (Holland 1988; Little and Yau 1998; Angrist et al. 1996; Frangakis and Rubin 2002; Rubin 2004, 2005; Rosenbaum 2010, 2017; Imbens and Rubin 2015).

Using Neyman’s potential outcome notation, the individual causal treatment effect between two treatments T=0 and T=1 can be defined for subject i as:

Y_i(1) - Y_i(0)
Note that, though we are able to define the individual causal treatment effect in theory, it is NOT estimable, because we can observe only one potential outcome of the same subject while keeping all other characteristics unchanged. Instead, we can define other types of causal treatment effects that are estimable (“estimands”). For example, the average causal treatment effect (ATE),

ATE = E[Y_i(1)] - E[Y_i(0)],

where Y_i(t) represents the potential outcome of the ith subject given treatment t (t = 0, 1), and E[·] represents the expectation taken over the population.

In randomized experiments, estimating the ATE is straightforward because the confounders are balanced between treatment groups. For non-randomized studies, under the RCM framework, the focus is on mimicking randomization when randomization is actually not available. RCM places equal importance on both the design and the analysis stages of non-randomized studies. The idea of being “outcome free” at the design stage, before any analysis, is an important component of RCM. This means that researchers should not have access to data on the outcome variables until they have finalized all aspects of the design, including ensuring that balance in the distribution of potential baseline confounders between treatments can be achieved.

Since only pre-treatment confounders would be used in this design phase, this approach is similar to the “intention-to-treat” analysis in RCTs. RCM requires three key assumptions.

1. Stable Unit Treatment Value Assumption (SUTVA): the potential outcomes for any subject do not vary with the treatments assigned to other subjects, and, for each subject, there are no different forms or versions of each treatment level that would lead to different potential outcomes.

2. Positivity: the probability of assignment to either intervention for each subject is strictly between 0 and 1.

3. Unconfoundedness: the assignment to treatment for each subject is independent of the potential outcomes, given a set of pre-treatment covariates. In practice, this means that all potential confounders should be measured in order to properly assess the causal effect.
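Writing X for the set of pre-treatment covariates, the second and third assumptions are commonly stated in potential outcome notation as follows (a standard formulation, not a quotation from later chapters):

0 < P(T = 1 | X = x) < 1 for all x    (positivity)

(Y(0), Y(1)) ⊥ T | X    (unconfoundedness)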

If these assumptions hold in a non-randomized study, methods under RCM, such as propensity score-based methods, are able to provide unbiased estimates of the causal effect for the estimand of interest. We will further discuss estimands later in this chapter and provide case examples of the use of propensity score-based methods in Chapters 4, 6, 7, and 8.
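As a preview of those chapters, the following minimal SAS sketch (the data set name cohort and the confounders age and severity are hypothetical; the full implementations in later chapters are more careful) estimates a propensity score by logistic regression and uses inverse probability of treatment weights to compare weighted outcome means:

/* Estimate the propensity score P(T=1 | age, severity) */
proc logistic data=cohort descending;
  model t = age severity;
  output out=ps_out p=ps;     /* ps = estimated propensity score */
run;

/* Inverse probability of treatment weights targeting the ATE */
data weighted;
  set ps_out;
  if t = 1 then w = 1/ps;
  else w = 1/(1 - ps);
run;

/* Weighted outcome means by treatment group */
proc means data=weighted mean;
  class t;
  var y;
  weight w;
run;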

2.3.4 Pearl’s Causal Model

In contrast to the RCM, Pearl advocates a different approach to interpreting causation, which combines aspects of structural equation models and path diagrams (Halpern and Pearl 2005a, 2005b; Pearl 2009a, 2009b). The directed acyclic graph (DAG) approach, which is part of the PCM, is another method commonly used in the field of epidemiology. Figure 2.1 presents a classical causal DAG, that is, a graph whose nodes (vertices) are random variables joined by directed edges (arrows), with no directed cycles. In the situations described in this book, V denotes a (set of) measured pre-treatment patient characteristic(s) (or confounders), A the treatment/intervention of interest, Y the outcome of interest, and U a (set of) unmeasured confounders.

Figure 2.1: Example Directed Acyclic Graph (DAG)


Causal graphs are graphical models that are used to encode assumptions about the data-generating process. All common causes of any pair of variables in the DAG must themselves be included in the DAG. In a DAG, the nodes correspond to random variables and the edges represent the relationships between the random variables. The causal assumptions are encoded by the arrows that are absent: an arrow from node A to node Y may (or may not) represent a direct causal effect of A on Y, whereas the absence of an arrow between U and A in the DAG means that U is assumed not to affect A. From the DAG, a series of conditional independencies is induced, so that the joint distribution of (V, A, Y) can be factorized into a series of conditional probabilities. Like RCM, PCM also rests on several key assumptions, including the same SUTVA and positivity assumptions. For other, more technical concepts, such as d-separation, we refer you to the literature by Pearl, Robins, and their colleagues. (See above.)
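For the simple DAG in Figure 2.1, setting aside the unmeasured U, this factorization takes the form below. This is a standard consequence of the DAG Markov property rather than a formula from this chapter:

P(V, A, Y) = P(V) P(A | V) P(Y | A, V)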

The timing of obtaining the information about V, A, and Y can also be encoded in DAGs. Longitudinal data that may change over time are therefore shown as a sequence of data points, as in Figure 2.2. Note: to be consistent with the causal inference literature on longitudinal data, we use L to represent time-varying covariates and V for non-time-varying covariates (thus the covariate set in Figure 2.1 is V). Time-dependent confounding occurs when a confounder (a variable that influences both intervention and outcome) is also affected by the intervention (that is, it is an intermediate step on the path from intervention to outcome), as shown in Figure 2.2. In those cases, g-methods, such as inverse probability of treatment weighting (IPTW) (Chapter 11), need to be applied; a sketch of the weights follows Figure 2.2.

Figure 2.2: Example DAG with Time Varying Confounding
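In this longitudinal setting, the IPTW weights generalize the single-time-point weights: each subject’s weight is a product over the treatment times t = 0, …, K. A standard stabilized form is sketched below, where an overbar denotes history up to the given time; this is the generic textbook formula rather than an excerpt from Chapter 11:

sw_i = ∏ (t = 0 to K) f(A_{i,t} | Ā_{i,t-1}) / f(A_{i,t} | Ā_{i,t-1}, L̄_{i,t})

Informally, the denominator is each subject’s probability of the treatment actually received at time t given treatment and covariate history, and the numerator stabilizes the weights using treatment history alone.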


Causal graphs can be used to visualize and understand the data availability and data structure as well as to communicate data relations and correlations. DAGs are used to identify:

1. Potential biases

2. Variables that need to be adjusted for

3. Methods that need to be applied to obtain unbiased causal effects

Potential biases include time-independent confounding, time-dependent confounding, unmeasured confounding, and bias from controlling for a collider.

There are a few notable differences between RCM and PCM that deserve mention:

● PCM can provide an understanding of the underlying data-generating system, that is, the relationships among confounders themselves and between confounders and outcomes, while the focus of RCM is on re-creating the balance in the distribution of confounders in non-randomized studies.

● The idea of “outcome-free” analysis is not applicable in PCM.

● PCM does not accommodate some types of estimands, for instance, the compliers’ average treatment effect.

2.4 Estimands

As stated before, the individual causal treatment effect is NOT estimable. Thus, we need to carefully consider the other types of causal effects that we would like to estimate, that is, the estimand. An estimand defines the causal effect of interest that corresponds to a particular study objective, or, simply speaking, what is to be estimated. In the recently drafted ICH E9 Addendum (https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM582738.pdf), regulators clearly separate the concepts of estimands and estimators. From the addendum, an estimand includes the following key attributes:

● The population, in other words, the patients targeted by the specific study objective

● The variable (or endpoint) to be obtained for each patient that is required to address the scientific question

● The specification of how to account for intercurrent events (events occurring after treatment initiation, such as concomitant treatment or medication switching, and so on) to reflect the scientific question of interest

● The population-level summary for the variable that provides, as required, a basis for a comparison between treatment conditions

Once the estimand of the study is specified, appropriate methods can then be selected. This is of particular importance at the study design stage because different methods may yield different causal interpretations. For example, if the study objective is to estimate the causal treatment effect of drug A versus drug B in the entire study population, then matching might not be appropriate, because the matched population might not be representative of the original overall study population.

Below are a few examples of popular estimands, with ATE and ATT often used in comparative analyses of observational data in health care applications; formal definitions in potential outcome notation follow the list.

● Average treatment effect (ATE): ATE is a commonly used estimand in comparative observational research and is defined as the average difference in the pairs of potential outcomes, averaged over the entire population. The ATE can be interpreted as the difference in the outcome of interest had every subject taken treatment A versus had every subject taken treatment B.

● Average treatment effect of treated (ATT): Sometimes we are interested in the causal effect only among those who received one intervention of interest (“treated”). In this case the estimand is the average treatment effect of treated (ATT), which is the average difference of the pairs of potential outcomes, averaged over the “treated” population. ATT can be interpreted as the difference in the outcome had every treated subject been “treated,” versus the counterfactual outcomes had every “treated” subject taken the other intervention. Notice, in a randomized experiment, ATT is equivalent to ATE.

● Compliers’ average treatment effect (CATE): In RCTs or observational studies, there is an interest in understanding the causal treatment effect for those who complied with their assigned interventions (Frangakis and Rubin 2002). Such interest generates an estimate of the CATE as described below.
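In the potential outcome notation of Section 2.3, the first two estimands can be written compactly as follows (a standard formulation consistent with the definitions above):

ATE = E[Y(1) - Y(0)]

ATT = E[Y(1) - Y(0) | T = 1]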

Regarding the CATE, let us first consider the scenario in a randomized experiment. In an intention-to-treat analysis, we compare individuals assigned to the treatment group (but who did not necessarily receive it) with individuals assigned to the control group (some of whom might have received the treatment). This comparison is valid due to the random assignment, but it does not necessarily produce an estimate of the effect of the treatment, rather it estimates the effect of assigning or prescribing a treatment. The instrumental variables estimator in this case adds an assumption and modifies the intention-to-treat estimator to an estimator of the effect of the treatment. The key assumption is that the assignment has no causal effect on the outcome except through a causal effect on the receipt of the treatment.

In general, we can think of there being four types of individuals characterized by the response to the treatment assignment. There are individuals who always receive the treatment, regardless of their assignment, the “always-takers.” There are individuals who never receive the treatment, regardless of their assignment, the “never-takers.” For both of these subpopulations, the key assumption is that there is no effect of the assignment whatsoever. Then there are individuals who will always comply with their assignment, the “compliers.” We typically rule out the presence of the fourth group, the “defiers” who do the opposite of their assignment. We can estimate the proportion of compliers (assuming no defiers) as the share of treated among those assigned to the treatment minus the share of treated among those assigned to the control group. The instrumental variables estimator is then the ratio of the intent-to-treat effect on the outcome divided by the estimated share of compliers. This has the interpretation of the average effect of the receipt of the treatment on the outcome for the subpopulation of the compliers, referred to as the “local average treatment effect” or the complier average treatment effect.
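Writing Z for the randomized assignment (the instrument) and T for the treatment actually received, this instrumental variables estimator can be expressed as the Wald ratio below; this is the standard formulation of the complier (local) average treatment effect rather than a formula quoted from this chapter:

CATE = (E[Y | Z = 1] - E[Y | Z = 0]) / (E[T | Z = 1] - E[T | Z = 0])

The numerator is the intention-to-treat effect of assignment on the outcome, and the denominator is the estimated share of compliers.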

Beyond the setting of a completely randomized experiment with non-compliance where the assignment is the instrument, these methods can also be used in observational settings. For example, ease of access to medical services as measured by distance to medical facilities that provide such services has been used as an instrument for the effect of those services on health outcomes.

Note that these descriptions, while commonly used in the comparative effectiveness literature, do not fully define the estimand, because they do not address intercurrent events. However, it is possible to use the strategies proposed in the addendum to define estimands in observational studies when intercurrent events exist. For instance, we could define the hypothetical average treatment effect as the difference between the two counterfactuals assuming everybody takes treatment A versus everybody takes treatment B, in both cases without the intercurrent event.

2.5 Totality of Evidence: Replication, Exploratory, and Sensitivity Analyses

As briefly mentioned at the beginning of this chapter, whether causation can be ascertained from empirical observations is a legitimate debate. The literature includes multiple examples of claims from observational studies that have been found not to be causal relationships (Ioannidis 2005, Ryan et al. 2012, Hemkens et al. 2016), though some of these critiques have themselves been disputed (Franklin et al. 2017). Unfortunately, unless we have a well-designed and executed randomized experiment in which other possible causal interpretations can be ruled out, it is difficult to fully ensure that a causal interpretation is valid. Therefore, even after a comparative observational study using appropriate bias control analytical methods, it is natural to raise the following questions. “Can we believe the causation assessed from a single observational study? How much confidence should we place on the estimated causal effect? Is there any hidden bias not controlled for? Are there any critical assumptions that are violated?” Several of the guidance documents in Table 1.2 provide a structured, high-level approach to understanding the quality of evidence from a particular study and thus start to address these questions. Grimes and Schulz (2002) also summarized questions to ask when assessing the validity of a causal finding from observational research, including the temporal sequence, the strength and consistency of the association, the biological gradient and plausibility, and coherence with existing knowledge. To expand on these ideas, we introduce the concept of totality of evidence, which represents the strength of the evidence used to form an opinion about causation. The totality of evidence should include the following elements:

● Replicability

● Implications from exploratory analysis

● Sensitivity analysis on the critical assumptions

First, let us discuss replicability. Figure 2.3 summarizes the well-accepted evidence paradigm in health care research.

Figure 2.3: Hierarchy of Evidence


Evidence generated from multiple RCTs sits atop the paradigm, followed by evidence from single RCTs (Sackett et al. 1996, Masic et al. 2008). Similarly, for non-randomized studies, if we were able to conduct several studies of the same research question, for example, by replicating the same study on different databases, then the evidence from all of those studies would be considered stronger than the evidence from any single observational study, as long as they were all reasonably designed and properly analyzed. Here is why. Assume the “false positive” chance of observing a causal effect in any one study is 5%, and that we make the causal claim only if all studies reflect a causal effect. If we have two studies, then the chance that both studies are “false positives” would be 5% × 5% = 0.25% (1 in 400). With a single study, however, the chance of a false positive causal claim is 1 in 20. Thus, replication is an important component when justifying a causal relationship.
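More generally, if each of k studies carries a false positive rate α and the studies are statistically independent (a strong assumption added here for illustration), requiring all of them to show the effect drives the joint false positive rate down geometrically:

P(all k studies are false positives) = α^k, for example, 0.05^2 = 0.0025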

However, as Vandenbroucke (2008) points out, proper replication in observational research is more challenging than for RCTs, because challenges to conclusions from observational research are typically due to potential uncontrolled bias and not to chance. For example, Zhang et al. (2016) described a setting of comparative research on osteoporosis treatments using claims data that lacked bone mineral density values (an unmeasured confounder). Simply replicating this work in the same type of database, with the same unmeasured confounder, would not remove the concern about bias. Thus, replication that addresses not only the potential for chance findings but also bias, by involving different data or different assumptions, might be required.

The second element is implications from exploratory analyses, and we borrow the following example from Cochran (1972) for demonstration purposes.

For causes of death for which smoking is thought to be a leading contributor, we can compare death rates for nonsmokers and for smokers of different amounts, for ex-smokers who have stopped for different lengths of time but used to smoke the same amount, for ex-smokers who have stopped for the same length of time but used to smoke different amounts, and for smokers of filter and nonfilter cigarettes. We can do this separately for men and women and also for causes of death to which, for physiological reasons, smoking should not be a contributor. In each comparison the direction of the difference in death rates and a very rough guess at the relative size can be made from a causal hypothesis and can be put to the test.

Different from replicability, this approach follows the idea of “proof by contradiction.” That is, assuming there is causal relationship between the intervention and the outcome, what would be the possible consequences? If those consequences were not observed, then a causal relationship is questionable.

Lastly, each causal framework is based on assumptions. Therefore, the importance of sensitivity analysis should never be underestimated. The magnitude of the bias induced by violating certain assumptions should be quantitatively assessed. For example, the Rosenbaum-Rubin sensitivity analysis (Rosenbaum and Rubin, 1983, JRSS-B) was proposed to quantify the impact of a potential unmeasured confounder, though the idea can be traced back to Cornfield et al. (1959). Sensitivity analyses should start with the assumptions made for a causal interpretation, such as positivity, no unmeasured confounding, and correct modeling. Sensitivity analysis to evaluate the impact of unmeasured confounders is discussed in more detail in Chapter 13 of this book. The DAGs discussed above can be used to assess the potential direction of bias due to unmeasured confounding. For assumptions that are not easily tested through quantitative methods (for example, SUTVA, positivity), researchers should think critically at the design stage to ensure that these assumptions are reasonable in the given situation.
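For reference, the sensitivity model behind the Rosenbaum-style analysis bounds how strongly an unmeasured confounder could influence treatment assignment: two subjects i and j with the same observed covariates may differ in their odds of receiving treatment by at most a factor Γ, where Γ = 1 corresponds to no hidden bias (a standard formulation, not a formula from Chapter 13):

1/Γ ≤ [π_i (1 - π_j)] / [π_j (1 - π_i)] ≤ Γ, where π_i = P(T_i = 1 | observed covariates)

The analysis then reports how large Γ would have to be before the study’s conclusion changes.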

2.6 Summary

This chapter has provided an overview of the theoretical background for properly inferring causal relationships in non-randomized observational research. This background serves as the foundation of the statistical methodologies that will be used throughout the book. It includes an introduction to the potential outcome concept, the Rubin and Pearl causal frameworks, estimands, and the totality of evidence. For most chapters of this book, we follow Rubin’s causal framework. DAGs will be used to understand the relationships between interventions and outcomes, between confounders and outcomes, and between interventions and confounders, and to assess the causal effect when post-baseline confounding is present. Also critical are an understanding of the three core assumptions for causal inference under RCM and the necessity of conducting sensitivity analyses aligned with those assumptions in applied research.

References

Angrist JD, Imbens GW, Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91.434: 444-455.

Cochran WG (1972). Observational studies. In Bancroft TA (Ed.) (1972), Statistical papers in honor of George W. Snedecor (pp. 77-90). Ames, IA: Iowa State University Press. Reprinted in Observational Studies 1, 126–136.

Cornfield J et al. (1959) Smoking and lung cancer: recent evidence and a discussion of some questions. Journal of the National Cancer Institute 22.1: 173-203.

Fisher RA (1922). On the interpretation of χ2 from contingency tables, and the calculation of P. Journal of the Royal Statistical Society 85.1: 87-94.

Fisher RA (1936a). Design of experiments. Br Med J 1.3923: 554.

Fisher RA (1936b). Has Mendel’s work been rediscovered? Annals of Science 1.2: 115-137.

Fisher RA (1937). The design of experiments. Edinburgh; London: Oliver and Boyd.

Fisher RA, Wishart J (1930). The arrangement of field experiments and the statistical reduction of the results. No. 10. HM Stationery Office.

Frangakis CE, Rubin DB (2002). Principal stratification in causal inference. Biometrics 58.1: 21-29.

Franklin JM, Dejene S, Huybrechts KF, Wang SV, Kulldorff M, Rothman KJ (2017). A Bias in the Evaluation of Bias Comparing Randomized Trials with Nonexperimental Studies. Epidemiologic Methods. DOI: 10.1515/em-2016-0018.

Grimes DA and Schulz KF (2002). Bias and Causal Associations in Observational Research. Lancet 359:248-252.

Halpern JY, Pearl J (2005a). Causes and explanations: A structural-model approach. Part I: Causes. British Journal for the Philosophy of Science 56:843-887.

Halpern JY, Pearl J (2005b). Causes and explanations: A structural-model approach. Part II: Explanations. British Journal for the Philosophy of Science 56:889-911.

Hemkens LG, Contopoulos-Ioannidis DG, Ioannidis JPA (2016). Agreement of Treatment Effects for Mortality from Routinely Collected Data and Subsequent Randomized Trials: Meta-Epidemiological Survey. BMJ 352:i493.

Holland PW (1986). Statistics and causal inference. Journal of the American Statistical Association 81.396: 945-960.

Holland PW (1988). Causal inference, path analysis and recursive structural equations models. ETS Research Report Series 1988.1: i-50.

Imbens GW, Rubin DB (2015). Causal inference in statistics, social, and biomedical sciences. New York: Cambridge University Press.

Ioannidis JPA (2005). Why Most Published Research Findings Are False. PLoS Med 2(8):e124.

Little RJ, Yau LHY (1998). Statistical techniques for analyzing data from prevention trials: Treatment of no-shows using Rubin’s causal model. Psychological Methods 3.2: 147.

Masic I, Miokovic M, Muhamedagic B (2008). Evidence based medicine–new approaches and challenges. Acta Informatica Medica 16.4: 219.

Pearl J (2009a). Causal inference in statistics: An overview. Statistics Surveys 3:96-146.

Pearl J (2009b). Causality: Models, Reasoning and Inference. 2nd Edition. New York: Cambridge University Press.

Rosenbaum PR (2010). Design of observational studies. Vol. 10. New York: Springer.

Rosenbaum PR (2017). Observation and experiment: an introduction to causal inference. Cambridge, MA: Harvard University Press.

Rosenbaum PR, Rubin DB (1983). Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society: Series B (Methodological) 45.2: 212-218.

Rosenbaum PR, Rubin DB (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70.1: 41-55.

Rubin DB (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66.5: 688-701.

Rubin DB (2004). Direct and indirect causal effects via potential outcomes. Scandinavian Journal of Statistics 31.2: 161-170.

Rubin DB (2005). Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association 100.469: 322-331.

Rubin DB (1978). Bayesian Inference for Causal Effects: The Role of Randomization. The Annals of Statistics, 6: 34–58.

Rubin DB (1977). Assignment of Treatment Group on the Basis of a Covariate. Journal of Educational Statistics, 2: 1–26.

Ryan PB, Madigan D, Stang PE, Overhage JM, Racoosin JA, Hartzema AG (2012). Empirical Assessment of Methods for Risk Identification in Healthcare Data: Results from the Experiments of the Observational Medical Outcomes Partnership. Stat in Med 31:4401-4415.

Sackett DL et al. (1996). Evidence based medicine: what it is and what it isn’t. BMJ 312(7023): 71-72.

Vandenbroucke JP (2008). Observational Research, Randomised Trials, and Two Views of Medical Science. PLoS Med 5(3):339-343.

Yule GU (1895). On the correlation of total pauperism with proportion of out-relief. The Economic Journal 5.20: 603-611.

Yule GU (1897). On the theory of correlation. Journal of the Royal Statistical Society 60.4: 812-854.

Yule GU (1899). An investigation into the causes of changes in pauperism in England, chiefly during the last two intercensal decades (Part I.). Journal of the Royal Statistical Society 62.2: 249-295.

Zhang X, Faries DE, Boytsov N, et al. (2016). A Bayesian sensitivity analysis to evaluate the impact of unmeasured confounding with external data: a real world comparative effectiveness study in osteoporosis. Pharmacoepidemiology and drug safety 25(9):982-92.

