Merge branch 'rewrite_section' of https://git.youainti.com/youainti/ClinicalTrialsPaper into rewrite_section

claude_rewrite
Will King 1 year ago
commit e2576b5fc1

@ -5,9 +5,9 @@ How do I begin work on stuff
- we can't trust what we are told
- terminations could be due to safety, strategic, or operational concerns.
- explaining confounding between
- population/market and enrollment.
-population/market and market conditions.
- market conditions and enrollment.
- population/market and enrollment.
- describe other confounders
- safety and effectiveness
- duration <--> enrollment/termination
@ -17,6 +17,7 @@ How do I begin work on stuff
- Introduce Do-Calculus
- DAG model
- What do I need to control for, in some form or other?
CURRENTLY HERE:
- Introduce Data
- Clinical Trial Progression
- AACT gives us information on

@ -3,6 +3,7 @@
\begin{document}
In the sections below, I examine each source of data, their key features,
how they match with the variables in the Structural Model DAG,
and describe applicable terminology (\cref{datasources}).
I then discuss how these sources were tied together (\cref{datalinks}) and
describe the specific data used in the analysis (\cref{dataintegration}).

@ -10,7 +10,22 @@ and an operational concern
(the effect of a delay in closing enrollment),
we need to look at what confounds these effects and how we might measure them.
There are a few fundamental issues.
The primary effects one expects to see are that
\begin{enumerate}
\item Adding more drugs will make it harder to finish a trial as it is
more likely to be terminated due to concerns about profitability.
\item Adding more drugs will make it harder to recruit, slowing enrollment.
\item Enrollment challenges increase the likelihood that a trial will
terminate.
% Mentioned below
% \item A large population/market will tend to have more drugs to treat it
% because it is more profitable.
% \item A large population/market will make it easier to recruit,
% reducing the likelihood of a termination due to enrollment failure.
\end{enumerate}
There are a few fundamental issues that arise when trying to estimate
these effects.
The first is that the severity of the disease and the size of the population
who has that disease affects the ease of enrolling participants.
For example, a large population may make it easier to find enough participants
@ -20,22 +35,61 @@ Second, for some diseases there exists an endogenous dynamic
between the treatments available for a disease and the
market size/population with that disease.
\authorcite{cerda_EndogenousInnovations_2007} proposes two mechanisms
that link drugs on the market and market size.
The first is that a large market will tend to have more drugs to treat it.
that link the drugs on the market and market size.
The inverse is that for many chronic diseases with high mortality rates,
more drugs cause better survivability, increasing the size of those markets.
The third major confound is that the drugs on the market affect enrollment.
If there is a treatment already on the market, patients or their doctors
may be less inclined to participate in the trial, even if the current treatment
has severe downsides.
There are additional problems.
One is that the disease being treated affects the
safety and efficacy profile that the drug will be held to.
For example, if a particular cancer is very deadly and does not respond well
to current treatments, Phase I trials will enroll patients with that cancer,
as opposed to the standard of enrolling healthy volunteers
\cite{commissioner_DrugDevelopment_2020}.
The trial is more likely to be terminated early if the drug is unsafe or has no
discernible effect; therefore, termination depends in part on a compound-disease
interaction.
Another challenge comes from the interaction between duration and termination:
if a trial terminates before closing enrollment for reasons other
than enrollment, then enrollment will still be low.
On the other hand, if enrollment is low, the trial might terminate.
These outcomes are indistinguishable in the data provided by the final
\url{ClinicalTrials.gov} dataset.
%%%%% \/\/\/\/\/ OLD STUFF \/\/\/\/\/
Finally, while conducting a trial, the safety and efficacy of a drug are driven by
fundamental pharmacokinetic properties of the compounds.
These are only imperfectly measured both prior to and during any given trial.
Previously measured safety and efficacy inform the decision to start the trial
in the first place, while currently observed safety and efficacy results
help the sponsor judge whether or not to continue the trial.
Because running experiments on companies running clinical trials is not going
to happen anytime soon, causal identification will depend on creating a
structural causal model.
to happen anytime soon, causal identification depends on using an observational
approach and a structural causal model.
Because the data generating process for the clinical trials records is rather
straightforward, this is an ideal place to use
\authorcite{pearl_causality_2000}
Do-Calculus.
This process involves describing the data generating process in the form of
a directed acyclic graph, where the nodes represent different variables
within the causal model and the directed edges (arrows) represent
assumptions about which variables influence the other variables.
Several algorithms then tell the researcher which of the
relationships will be confounded and which can be statistically estimated,
and provide hypotheses that can be tested to ensure the model is
reasonably correct.
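As a sketch of what such a graph encodes (the notation here is introduced only
for illustration), write the variables as $V_1, \dots, V_n$ and the parents of
$V_i$ in the graph as $\mathrm{pa}(V_i)$.
The DAG then assumes that the joint distribution factorizes as
\[
P(v_1, \dots, v_n) = \prod_{i=1}^{n} P\bigl(v_i \mid \mathrm{pa}(v_i)\bigr),
\]
so each missing arrow corresponds to a conditional independence assumption
that can, in principle, be tested when the variables involved are observed.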
In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes
the data generating model.
The proposed data generating model consists of a decision maker, the study
sponsor, who must decide whether to let a trial run to completion or terminate
the trial early.
my proposed data generating process,
It revolves around the decisions made by the study sponsor,
who must decide whether to let a trial run to completion
or terminate the trial early.
While receiving updates regarding the status of the trial, they ask questions
such as:
\begin{itemize}
@ -43,65 +97,193 @@ such as:
\item Does it appear that the drug is effective enough to achieve our
goals, justifying continuing the trial?
\item Are we recruiting enough participants to achieve the statistical
results we need?
results we need in the budget we have?
\item Do the current market conditions and expectations about returns on
investment justify the expenditures we are making?
\end{itemize}
When appropriate, the study sponsor terminates the trial.
If there are not enough issues to terminate the trial, it continues until it
is completed.
While conducting a trial, the safety and efficacy of a drug are driven by
fundamental pharmacokinetic properties of the compounds.
These are only imperfectly measured both prior to and during any given trial.
Previously measured safety and efficacy inform the decision to start the trial
in the first place, while currently observed safety and efficacy results
help the sponsor judge whether or not to continue the trial.
Of course, these decisions are both affected by the specific condition being
treated due to differences in the severity of the symptoms.
Once a trial has been started, it comes time to recruit participants.
Participants frequently depend on the advice of their physician when deciding
whether to join a trial.
As these physicians have a duty to seek their patients' best interests, they, along
with their patients, will evaluate whether the previously observed safety and efficacy
results justify joining the trial over using current standard treatments.
Thus the current market conditions may affect the rate at which participants
enroll in the trial.
The enrollment of participants in a trial depends on a few other factors.
The condition or disease of interest and how it progresses will determine how long
recruitment is held open relative to the period spent only observing the treatment arms.
Additionally, a trial that has already reached a high enough enrollment will often
close recruitment by switching to an ``Active, not recruiting'' stage to manage costs.
Finally, enrolling participants depends on how difficult it is to find people
who suffer from the condition of interest.
The preceding issue of population size also affects the number of alternatives available.
When there are fewer people affected by the disease, the smaller market reduces
possible profitability, all else equal.
Thus the likelihood of companies paying the sunk costs to develop drugs for
these conditions may be lower.
Finally, the number of alternatives on the market may affect the return on
investment directly, causing a trial to terminate early if the return is
not high enough.
When sufficiently serious issues arise, the study sponsor terminates the trial; otherwise
it continues to completion.
\begin{figure}[H] %use [H] to fix the figure here.
\scalebox{0.6}{\tikzfig{../assets/tikzit/CausalGraph2}}
\caption{Causal Model}
\frame{
\scalebox{0.65}{
\tikzfig{../assets/tikzit/CausalGraph2}
}
}
\caption{Graphical Causal Model}
% \small{Crimson boxes are the variables of interest,
% white boxes are unobserved, while the gray boxes will be controlled for.}
\label{Fig:CausalModel}
\end{figure}
%
By using Judea Pearl's do-calculus, I can show that an adjustment
set consisting of the decision to conduct a phase III trial, the condition of interest,
the current status of the trial, and the population size causally
identifies the direct effects of enrollment and market alternatives on the
probability of termination.
This is easily verified through the backdoor criterion, which states that
if every path between the exposure and outcome that starts with an arrow
flowing into the exposure is blocked by the variables in the adjustment
set, and no variable in the adjustment set is a descendant of the exposure,
then the effect of the exposure on the outcome is causally identified
(\cite{pearl_causality_2000}).
That this is the case can be verified visually from the DAG in \cref{Fig:CausalModel}.
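As an illustration of what identification provides, and using notation that is
not part of the model description, let $T$ denote termination, $E$ the
enrollment status, and $Z$ an adjustment set satisfying the backdoor criterion
relative to $(E, T)$.
For discrete $Z$, the backdoor adjustment formula \cite{pearl_causality_2000}
then gives
\[
P\bigl(T = t \mid \mathrm{do}(E = e)\bigr)
= \sum_{z} P(T = t \mid E = e, Z = z) \, P(Z = z),
\]
so the interventional distribution on the left is expressed entirely in terms
of observational quantities.
This is a sketch of the general result under the stated assumptions, not a
description of the specific estimator used in the analysis.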
% Constructing the model more explicitly
% - quickly describe each node and line.
% TODO: double check which graphic to use.
A quick summary of the nodes of the DAG and their impact:
\begin{itemize}
\item Main Interests (Crimson Boxes)
\begin{enumerate}
\item \texttt{Will Terminate?}:
Whether the final status of the trial was \textit{terminated}
rather than \textit{completed}.
\item \texttt{Enrollment Status}:
Measure of whether enrollment is progressing.
\item \texttt{Market Measures}:
Various measures of the number of alternate drugs on the market.
\end{enumerate}
\item Observed Confounders (Gray Boxes)
\begin{enumerate}
\item \texttt{Condition}:
The underlying condition.
This impacts every other aspect of the model.
\item \texttt{Population (market size)}:
Multiple measures of the impact the disease has (in DALYs).
\item \texttt{Elapsed Duration}:
A normalized measure of the trial progression.
\item \texttt{Decision to Proceed with Phase III}:
If the compound development has progressed to Phase III.
\end{enumerate}
\item Unobserved Confounders (White Boxes)
\begin{enumerate}
\item \texttt{Fundamental Efficacy and Safety}:
The underlying efficacy and safety of the compound.
Cannot be observed, only estimated through scientific study.
\item \texttt{Previously observed Efficacy and Safety}:
The information gathered in previous studies.
\item \texttt{Currently observed Efficacy and Safety}:
The information gathered during this study.
\end{enumerate}
\end{itemize}
\begin{itemize}
\item Relationships of interest
\begin{enumerate}
\item \texttt{Enrollment Status} $\rightarrow$ \texttt{Will Terminate?}:
This is the primary effect of interest.
\item \texttt{Market Measures} $\rightarrow$ \texttt{Will Terminate?}:
This is the secondary effect of interest.
\end{enumerate}
\item Confounding Pathways
\begin{enumerate}
\item
\texttt{Condition}:
Affects every other node.
Part of the Adjustment Set.
\item Backdoor Pathway
between \texttt{Will Terminate?} and
\texttt{Enrollment Status} through safety and efficacy.
The concern is that since previously learned information
and current information are driven by the same underlying
physical reality, the enrollment process and
termination decisions may be correlated.
Controlling for the decision to proceed with the trial is the
best adjustment available to block this confounding pathway.
Below I describe the exact pathways.
\begin{enumerate}
\item
\texttt{Fundamental Efficacy and Safety}
$\rightarrow$
\texttt{Currently Observed Efficacy and Safety}:
This relationship represents the measurements of
safety and efficacy in the current trial.
\item
\texttt{Currently Observed Efficacy and Safety}
$\rightarrow$
\texttt{Will Terminate?}:
This is how the measurements of safety and efficacy in the
current trial affect the probability of termination.
% typically, evidence of a lack of safety or efficacy is
% enough to terminate the trial.
\item \texttt{Fundamental Efficacy and Safety}
$\rightarrow$
\texttt{Previously Observed Efficacy and Safety}:
This relationship represents the measurements of
safety and efficacy in work prior to the current trial.
\item
\texttt{Previously Observed Efficacy and Safety}
$\rightarrow$
\texttt{Decision to proceed with Phase III}:
Previously observed data is essential to the FDA's
decision to allow a phase III trial.
\end{enumerate}
\item
Backdoor Pathway from \texttt{Market Measures}
to \texttt{Enrollment Status}
through \texttt{Population}.
The concern with this pathway is that the rate of enrollment, and
thus the enrollment status, is affected by the Population with
the disease.
Additionally, there is a concern that the number of competitors
is driven by the total market size.
Thus adding Population to the adjustment set is necessary.
\begin{enumerate}
\item
\texttt{Population}
$\rightarrow$
\texttt{Enrollment Status}:
This is fairly straightforward.
How easy it is to enroll participants depends in part
on how many people have the disease.
\item
\texttt{Population}
$\rightarrow$
\texttt{Market Measures}:
This assumes that the population effect flows in only one
direction, i.e., that a large population size increases
the likelihood of a large number of drugs.
%TODO: Think about this one a bit because it does mess
% with identification, particularly of market effects.
% these two are jointly determined per cerda 2007.
% If I can't justify separating them, then I'll need to
% merge population (market size) and market measures (drugs on market).
\end{enumerate}
\item
\texttt{Market Measures}
$\rightarrow$
\texttt{Enrollment Status}:
This confounds the estimation of the effect of
\texttt{Enrollment Status} on \texttt{Will Terminate?}, and
so \texttt{Market Measures} is part of the adjustment set.
\item
\texttt{Market Measures}
$\rightarrow$
\texttt{Decision to proceed with Phase III}:
The alternative treatments on the market will affect a sponsor's
decision to move forward with a Phase III trial.
This is controlled for by only working with trials that
successfully begin recruitment for a Phase III Trial.
\item
\texttt{Elapsed Duration}
$\rightarrow$
\texttt{Will Terminate?}:
The amount of time passed helps drive the decision to continue
or terminate.
\item
\texttt{Enrollment Status}
$\leftrightarrow$
\texttt{Elapsed Duration}:
% This is jointly determined. and the weakest part of the causal identification without an accurate model of enrollment.
This is one of the weakest parts of the causal inference.
Without a well-defined model of enrollment, we can't separate
the interaction between the enrollment status and the elapsed
duration.
For example, if enrollment is running slower than expected,
the trial may be terminated due to concerns that it will not
achieve the primary objectives or that costs will exceed
the budget allocated to the project.
\item
\texttt{Decision to Proceed with Phase III}
$\rightarrow$
\texttt{Will Terminate?}:
%obviously required. Maybe remove from listing and graph?
This effect is fairly straightforward, in that
there is no possibility of a termination or completion
if the trial does not start.
This is here to block a backdoor pathway between
\texttt{Will Terminate?} and the enrollment status
through \texttt{Previously observed Efficacy and Safety}.
\end{enumerate}
\end{itemize}
\end{document}
