\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}

\begin{document}

% Begin by talking about goal, what does it mean? This might need some work prior to give more background.
Because I am trying to separate a strategic concern
(the effect of a marginal treatment methodology)
from an operational concern
(the effect of a delay in closing enrollment),
I need to look at what confounds these effects and how I might measure them.

The primary effects one expects to see are that
\begin{enumerate}
    \item Adding more drugs to the market will make it harder to finish a trial, as it is
        more likely to be terminated due to concerns about profitability.
    \item Adding more drugs to the market will make it harder to recruit, slowing enrollment.
    \item Enrollment challenges increase the likelihood that a trial will
        terminate.
    % Mentioned below
    % \item A large population/market will tend to have more drugs to treat it
    %   because it is more profitable.
    % \item A large population/market will make it easier to recruit,
    %   reducing the likelihood of a termination due to enrollment failure.
\end{enumerate}

There are a few fundamental issues that arise when trying to estimate
these effects.
The first is that the severity of the disease and the size of the population
that has the disease affect the ease of enrolling participants.
For example, a large population may make it easier to find enough participants
to achieve the required statistical discrimination between
control and treatment.
Second, for some diseases there exists an endogenous dynamic
between the treatments available for a disease and the
market size/population with that disease.
\authorcite{cerda_EndogenousInnovations_2007} proposes two mechanisms
that link the drugs on the market and market size.
In one direction, a larger market is more profitable and so tends to attract more drugs.
In the inverse direction, for many chronic diseases with high mortality rates,
more drugs improve survival, increasing the size of those markets.
The third major confound is that the drugs on the market affect enrollment.
If there is a treatment already on the market, patients or their doctors
may be less inclined to participate in the trial, even if the current treatment
has severe downsides.

There are additional problems.
One is that the disease being treated affects the
safety and efficacy profile that the drug will be held to.
For example, if a particular cancer is very deadly and does not respond well
to current treatments, Phase I trials will enroll patients with that cancer,
rather than following the standard practice of enrolling healthy volunteers
\cite{commissioner_DrugDevelopment_2020}.
The trial is more likely to be terminated early if the drug is unsafe or has no
discernible effect, so termination depends in part on a compound-disease
interaction.
Another challenge comes from the interaction between duration and termination:
if a trial terminates before closing enrollment for reasons other
than enrollment, then its enrollment will still be low.
On the other hand, if enrollment is low, the trial might terminate.
These two cases are indistinguishable in the data provided by the final
\url{ClinicalTrials.gov} dataset.

Finally, while a trial is being conducted, the safety and efficacy of a drug are driven by
fundamental pharmacokinetic properties of the compound.
These are only imperfectly measured, both prior to and during any given trial.
Previously measured safety and efficacy inform the decision to start the trial
in the first place, while currently observed safety and efficacy results
help the sponsor judge whether or not to continue the trial.

Because running experiments on companies that conduct clinical trials is not
feasible, causal identification depends on using an observational
approach and a structural causal model.
Because the data generating process for the clinical trials records is rather
straightforward, this is an ideal place to use the do-calculus of
\authorcite{pearl_causality_2000}.
This process involves describing the data generating process in the form of
a directed acyclic graph, where the nodes represent different variables
within the causal model and the directed edges (arrows) represent
assumptions about which variables influence which other variables.
A few algorithms then tell the researcher which of the
relationships will be confounded and which ones can be statistically estimated,
and provide some hypotheses that can be tested to check that the model is
reasonably correct.

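As a brief illustration of what such an adjustment looks like, the standard backdoor adjustment formula is sketched below.
The symbols are introduced here only for illustration and are not tied to specific nodes of the model:
$X$ is a treatment variable, $Y$ is an outcome, and $Z$ is a set of observed variables that is assumed to satisfy
the backdoor criterion relative to $(X, Y)$ in the graph, with $Z$ taken to be discrete for simplicity.
\[
    P(Y = y \mid \mathrm{do}(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z) \, P(Z = z)
\]
The right-hand side contains only observational quantities, which is why finding a valid adjustment set matters.
If no observed set satisfies the criterion, this expression does not identify the effect.
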
In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes
my proposed data generating process.
It revolves around the decisions made by the study sponsor,
who must decide whether to let a trial run to completion
or terminate the trial early.
While receiving updates regarding the status of the trial, they ask questions
such as:
\begin{itemize}
    \item Do I need to terminate the trial due to safety incidents?
    \item Does it appear that the drug is effective enough to achieve our
        goals, justifying continuing the trial?
    \item Are we recruiting enough participants to achieve the statistical
        results we need within the budget we have?
    \item Do the current market conditions and expectations about returns on
        investment justify the expenditures we are making?
\end{itemize}
When such issues arise, the study sponsor terminates the trial; otherwise
it continues to completion.

\begin{figure}[H] %use [H] to fix the figure here.
    \frame{
        \scalebox{0.65}{
            \tikzfig{../assets/tikzit/CausalGraph2}
        }
    }
    \todo{check if this is the correct graph}
    \caption{Graphical Causal Model}

    % \small{Crimson boxes are the variables of interest,
    % white boxes are unobserved, while the gray boxes will be controlled for.}
    \label{Fig:CausalModel}
\end{figure}

% Constructing the model more explicitly
% - quickly describe each node and line.
\todo{I think I need to blend the data section in before this, to give some overall information on data.}
\todo{I may need to add some information on snapshots so that this makes sense.}

A quick summary of the nodes of the DAG, their exact representation in the data, and their impact:
\begin{itemize}
    \item Main Interests (Crimson Boxes)
    \begin{enumerate}
        \item \texttt{Will Terminate?}:
            Whether the final status of the trial was \textit{terminated}
            or \textit{completed}.
        \item \texttt{Enrollment Status}:
            This describes the current enrollment status of the snapshot, e.g.
            \texttt{Recruiting},
            \texttt{Enrolling by invitation only},
            or
            \texttt{Active, not recruiting}.
        \item \texttt{Market Measures}:
            Various measures of the number of alternate drugs on the market.
            These are either the number of other drugs with the same active ingredient as the trial drug
            (both generics and originators),
            or the number of drugs considered alternatives in various formularies published by the United States Pharmacopeia.
    \end{enumerate}
    \item Observed Confounders (Gray Boxes)
    \begin{enumerate}
        \item \texttt{Condition}:
            The underlying condition, classified by ICD-10 group.
            This impacts every other aspect of the model.
        \item \texttt{Population (market size)}:
            Multiple measures of the impact of the disease.
            These are measured by the DALY cost of the disease in countries that have
            High, High-Medium, Medium, Medium-Low, and Low development scores.
            This data comes from the Institute for Health Metrics and Evaluation's Global Burden of Disease study.
        \item \texttt{Elapsed Duration}:
            A normalized measure of the time elapsed in the trial.
            It is computed from the original estimate of the trial's primary completion date and the registered start date:
            I take the difference in days between these dates and calculate the percentage of that time that has elapsed.
        \item \texttt{Decision to Proceed with Phase III}:
            Whether the compound's development has progressed to Phase III.
            This is included in the analysis by only including Phase III trials.
    \end{enumerate}
    \item Unobserved Confounders (White Boxes)
    \begin{enumerate}
        \item \texttt{Fundamental Efficacy and Safety}:
            The underlying efficacy and safety of the compound.
            These cannot be observed, only estimated through scientific study.
        \item \texttt{Previously Observed Efficacy and Safety}:
            The information gathered in previous studies.
            This is not available in my dataset because I don't have links to prior studies.
        \item \texttt{Currently Observed Efficacy and Safety}:
            The information gathered during this study.
            This is only partially available, and so is treated as unavailable.
            After a study is over, the investigators are supposed to publish information about adverse events.
    \end{enumerate}
\end{itemize}

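One way to write the normalization behind \texttt{Elapsed Duration} is sketched below.
The notation is introduced here only for illustration and is an assumption about how the description above maps to a formula:
$t$ is the date of a snapshot, $t_{\mathrm{start}}$ is the registered start date, and $t_{\mathrm{pcd}}$ is the
originally estimated primary completion date, all measured in days, with $t_{\mathrm{pcd}} > t_{\mathrm{start}}$.
\[
    \mathrm{Elapsed}(t) = \frac{t - t_{\mathrm{start}}}{t_{\mathrm{pcd}} - t_{\mathrm{start}}}
\]
Under this sketch, a value of $0.5$ means that half of the originally planned duration has passed,
and a value above $1$ indicates a trial that has run past its original estimate.
Multiplying by $100$ expresses the same quantity as a percentage.
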
\begin{itemize}
    \item Relationships of interest
    \begin{enumerate}
        \item \texttt{Enrollment Status} $\rightarrow$ \texttt{Will Terminate?}:
            This is the primary effect of interest.
        \item \texttt{Market Measures} $\rightarrow$ \texttt{Will Terminate?}:
            This is the secondary effect of interest.
    \end{enumerate}
    \item Confounding Pathways
    \begin{enumerate}
        \item
            \texttt{Condition}:
            Affects every other node.
            Part of the Adjustment Set.
        \item Backdoor Pathway
            between \texttt{Will Terminate?} and
            \texttt{Enrollment Status} through safety and efficacy.
            The concern is that since previously learned information
            and current information are driven by the same underlying
            physical reality, the enrollment process and
            termination decisions may be correlated.
            Controlling for the decision to proceed with the trial is the
            best adjustment available to block this confounding pathway.
            Below I describe the exact pathways.
            \begin{enumerate}
                \item
                    \texttt{Fundamental Efficacy and Safety}
                    $\rightarrow$
                    \texttt{Currently Observed Efficacy and Safety}:
                    This relationship represents the measurements of
                    safety and efficacy in the current trial.
                \item
                    \texttt{Currently Observed Efficacy and Safety}
                    $\rightarrow$
                    \texttt{Will Terminate?}:
                    This is how the measurements of safety and efficacy in the
                    current trial affect the probability of termination.
                    % typically, evidence of a lack of safety or efficacy is
                    % enough to terminate the trial.
                \item \texttt{Fundamental Efficacy and Safety}
                    $\rightarrow$
                    \texttt{Previously Observed Efficacy and Safety}:
                    This relationship represents the measurements of
                    safety and efficacy in work prior to the current trial.
                \item
                    \texttt{Previously Observed Efficacy and Safety}
                    $\rightarrow$
                    \texttt{Decision to Proceed with Phase III}:
                    Previously observed data is essential to the FDA's
                    decision to allow a Phase III trial.
            \end{enumerate}
        \item
            Backdoor Pathway from \texttt{Market Measures}
            to \texttt{Enrollment Status}
            through \texttt{Population}.
            The concern with this pathway is that the rate of enrollment, and
            thus the enrollment status, is affected by the population with
            the disease.
            Additionally, there is a concern that the number of competitors
            is driven by the total market size.
            Thus adding \texttt{Population} to the adjustment set is necessary.
            \begin{enumerate}
                \item
                    \texttt{Population}
                    $\rightarrow$
                    \texttt{Enrollment Status}:
                    This is fairly straightforward.
                    How easy it is to enroll participants depends in part
                    on how many people have the disease.
                \item
                    \texttt{Population}
                    $\rightarrow$
                    \texttt{Market Measures}:
                    This assumes that the population effect flows in only one
                    direction, i.e. that a large population size increases
                    the likelihood of a large number of drugs.
                    %TODO: Think about this one a bit because it does mess
                    % with identification, particularly of market effects.
                    % these two are jointly determined per cerda 2007.
                    % If I can't justify separating them, then I'll need to
                    % merge population (market size) and market measures (drugs on market).
            \end{enumerate}
        \item
            \texttt{Market Measures}
            $\rightarrow$
            \texttt{Enrollment Status}:
            This confounds the estimation of the effect of
            \texttt{Enrollment Status} on \texttt{Will Terminate?}, and
            so \texttt{Market Measures} is part of the adjustment set.
        \item
            \texttt{Market Measures}
            $\rightarrow$
            \texttt{Decision to Proceed with Phase III}:
            The alternative treatments on the market will affect a sponsor's
            decision to move forward with a Phase III trial.
            This is controlled for by only working with trials that
            successfully begin recruitment for a Phase III trial.
        \item
            \texttt{Elapsed Duration}
            $\rightarrow$
            \texttt{Will Terminate?}:
            The amount of time that has passed helps drive the decision to continue
            or terminate.
        \item
            \texttt{Enrollment Status}
            $\leftrightarrow$
            \texttt{Elapsed Duration}:
            % This is jointly determined. and the weakest part of the causal identification without an accurate model of enrollment.
            This is one of the weakest parts of the causal inference.
            Without a well-defined model of enrollment, I can't separate
            the interaction between the enrollment status and the elapsed
            duration.
            For example, if enrollment is running slower than expected,
            the trial may be terminated due to concerns that it will not
            achieve the primary objectives or that costs will exceed
            the budget allocated to the project.
        \item
            \texttt{Decision to Proceed with Phase III}
            $\rightarrow$
            \texttt{Will Terminate?}:
            %obviously required. Maybe remove from listing and graph?
            This effect is fairly straightforward, in that
            there is no possibility of a termination or completion
            if the trial does not start.
            It is included to block a backdoor pathway between
            \texttt{Will Terminate?} and \texttt{Enrollment Status}
            through \texttt{Previously Observed Efficacy and Safety}.
    \end{enumerate}
\end{itemize}

\end{document}