|
|
|
@ -10,9 +10,10 @@ and an operational concern
|
|
|
|
(the effect of a delay in closing enrollment),
|
|
|
|
(the effect of a delay in closing enrollment),
|
|
|
|
we need to look at what confounds these effects and how we might measure them.
|
|
|
|
we need to look at what confounds these effects and how we might measure them.
|
|
|
|
|
|
|
|
|
|
|
|
The primary effects one expects to see are that
|
|
|
|
The primary effects one might expect to see are that
|
|
|
|
\begin{enumerate}
|
|
|
|
\begin{enumerate}
|
|
|
|
\item Adding more drugs will make it harder to finish a trial as it is
|
|
|
|
\item Adding more drugs to the market will make it harder to
|
|
|
|
|
|
|
|
finish a trial as it is
|
|
|
|
more likely to be terminated due to concerns about profitability.
|
|
|
|
more likely to be terminated due to concerns about profitability.
|
|
|
|
\item Adding more drugs will make it harder to recruit, slowing enrollment.
|
|
|
|
\item Adding more drugs will make it harder to recruit, slowing enrollment.
|
|
|
|
\item Enrollment challenges increase the likelihood that a trial will
|
|
|
|
\item Enrollment challenges increase the likelihood that a trial will
|
|
|
|
@ -45,11 +46,11 @@ has severe downsides.
|
|
|
|
|
|
|
|
|
|
|
|
There are additional problems.
|
|
|
|
There are additional problems.
|
|
|
|
One is that the disease being treated affects the
|
|
|
|
One is that the disease being treated affects the
|
|
|
|
safety and efficacy profile that the drug will be held to.
|
|
|
|
safety and efficacy standards that the drug will be held to.
|
|
|
|
For example, if a particular cancer is very deadly and does not respond well
|
|
|
|
For example, if a particular cancer is very deadly and does not respond well
|
|
|
|
to current treatments, Phase I trials will enroll patients with that cancer,
|
|
|
|
to current treatments, Phase I trials will enroll patients with that cancer,
|
|
|
|
as opposed to the standard of enrolling healthy volunteers
|
|
|
|
as opposed to the standard of enrolling healthy volunteers
|
|
|
|
\cite{commissioner_DrugDevelopment_2020}.
|
|
|
|
\cite{commissioner_DrugDevelopment_2020} to establish safe dosages.
|
|
|
|
The trial is more likely to be terminated early if the drug is unsafe or has no
|
|
|
|
The trial is more likely to be terminated early if the drug is unsafe or has no
|
|
|
|
discernible effect; therefore, termination depends in part on a compound-disease
|
|
|
|
discernible effect; therefore, termination depends in part on a compound-disease
|
|
|
|
interaction.
|
|
|
|
interaction.
|
|
|
|
@ -67,9 +68,70 @@ Previously measured safety and efficacy inform the decision to start the trial
|
|
|
|
in the first place, while currently observed safety and efficacy results
|
|
|
|
in the first place, while currently observed safety and efficacy results
|
|
|
|
help the sponsor judge whether or not to continue the trial.
|
|
|
|
help the sponsor judge whether or not to continue the trial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
|
|
|
\subsection{Clinical Trials Data Sources}
|
|
|
|
|
|
|
|
%% Describe data here
|
|
|
|
|
|
|
|
Since September 27, 2007, those who conduct clinical trials of FDA-controlled
|
|
|
|
|
|
|
|
drugs or devices on human subjects must register
|
|
|
|
|
|
|
|
their trial at \url{ClinicalTrials.gov}
|
|
|
|
|
|
|
|
(\cite{noauthor_fdaaa_nodate}).
|
|
|
|
|
|
|
|
This involves submitting information on the expected enrollment and duration of
|
|
|
|
|
|
|
|
trials, drugs or devices that will be used, treatment protocols and study arms,
|
|
|
|
|
|
|
|
as well as contact information for the trial sponsor and treatment sites.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
When starting a new trial, the required information must be submitted
|
|
|
|
|
|
|
|
``\dots not later than 21 calendar days after enrolling the first human subject\dots''.
|
|
|
|
|
|
|
|
After the initial submission, the data is briefly reviewed for quality and
|
|
|
|
|
|
|
|
then the trial record is published and the trial is assigned a
|
|
|
|
|
|
|
|
National Clinical Trial (NCT) identifier
|
|
|
|
|
|
|
|
\cite{noauthor_fdaaa_nodate}.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Each trial's record is updated periodically, including a final update that must occur
|
|
|
|
|
|
|
|
within a year of completing the primary objective, although exceptions are
|
|
|
|
|
|
|
|
available for trials related to drug approvals or for trials with secondary
|
|
|
|
|
|
|
|
objectives that require further observation\footnote{This rule came into effect in 2017.}
|
|
|
|
|
|
|
|
\cite{noauthor_fdaaa_nodate}.
|
|
|
|
|
|
|
|
Other than the requirements for the first and last submissions, all other
|
|
|
|
|
|
|
|
updates occur at the discretion of the trial sponsor.
|
|
|
|
|
|
|
|
Because the ClinicalTrials.gov website serves as a central point of information
|
|
|
|
|
|
|
|
on which trials are active or recruiting for a given condition or drug,
|
|
|
|
|
|
|
|
most trials are updated multiple times during their progression.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
There are two primary ways to access data about clinical trials.
|
|
|
|
|
|
|
|
The first is to search individual trials on ClinicalTrials.gov with a web browser.
|
|
|
|
|
|
|
|
This web portal shows the current information about the trial and provides
|
|
|
|
|
|
|
|
access to snapshots of previously submitted information.
|
|
|
|
|
|
|
|
Together, these features fulfill most of the needs of those seeking
|
|
|
|
|
|
|
|
to join a clinical trial.
|
|
|
|
|
|
|
|
For this project I've been able to scrape these historical records to establish
|
|
|
|
|
|
|
|
snapshots of the records provided.
|
|
|
|
|
|
|
|
%include screenshots?
|
|
|
|
|
|
|
|
The second way to access the data is through a normalized database set up by
|
|
|
|
|
|
|
|
the
|
|
|
|
|
|
|
|
\href{https://aact.ctti-clinicaltrials.org/}{Clinical Trials Transformation Initiative}
|
|
|
|
|
|
|
|
called AACT. %TODO: Get CITATION
|
|
|
|
|
|
|
|
The AACT database is available as a PostgreSQL database dump or set of
|
|
|
|
|
|
|
|
flat-files.
|
|
|
|
|
|
|
|
These dumps match a near-current version of the ClinicalTrials.gov database.
|
|
|
|
|
|
|
|
This format is amenable to large-scale analysis, but does not contain
|
|
|
|
|
|
|
|
information about the past state of trials.
|
|
|
|
|
|
|
|
I combined these two sources, using the AACT dataset to select
|
|
|
|
|
|
|
|
trials of interest and then scraping \url{ClinicalTrials.gov} to get
|
|
|
|
|
|
|
|
a timeline of each trial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%% Model Outline
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The way I use this data is to predict the final status of the trial
|
|
|
|
|
|
|
|
from the snapshots that were taken, in effect asking:
|
|
|
|
|
|
|
|
``how does the probability of a termination change from the current state
|
|
|
|
|
|
|
|
of the trial if X changes?''
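
% Illustrative formalization only: the symbols T, S, X, x, and x' are assumed notation,
% not taken from the model specification.
One way to make this question precise, as a sketch in notation that is assumed here
purely for illustration, is to let $T \in \{0, 1\}$ indicate that the trial's final
status is \textit{terminated}, let $S = s$ denote the state recorded in a snapshot,
and let $X$ be the variable of interest.
The quantity being asked about is then roughly
\[
  \Delta(x \to x' \mid s) =
  \Pr\bigl(T = 1 \mid S = s, \mathrm{do}(X = x')\bigr) -
  \Pr\bigl(T = 1 \mid S = s, \mathrm{do}(X = x)\bigr),
\]
where $\mathrm{do}(\cdot)$ marks a hypothetical intervention on $X$ rather than
passive observation of it.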
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
%% Return to causal identification
|
|
|
|
|
|
|
|
\subsection{Causal Identification}
|
|
|
|
|
|
|
|
|
|
|
|
Because running experiments on companies running clinical trials is not going
|
|
|
|
Because running experiments on companies running clinical trials is not going
|
|
|
|
to happen anytime soon, causal identification depends on using an observational
|
|
|
|
to happen anytime soon, causal identification depends on using a
|
|
|
|
approach and a structural causal model.
|
|
|
|
structural causal model.
|
|
|
|
Because the data generating process for the clinical trials records is rather
|
|
|
|
Because the data generating process for the clinical trials records is rather
|
|
|
|
straightforward, this is an ideal place to use
|
|
|
|
straightforward, this is an ideal place to use
|
|
|
|
\authorcite{pearl_causality_2000}
|
|
|
|
\authorcite{pearl_causality_2000}
|
|
|
|
@ -84,7 +146,6 @@ and provides some hypotheses that can be tested to ensure the model is
|
|
|
|
reasonably correct.
|
|
|
|
reasonably correct.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes
|
|
|
|
In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes
|
|
|
|
my proposed data generating process.
|
|
|
|
my proposed data generating process.
|
|
|
|
It revolves around the decisions made by the study sponsor,
|
|
|
|
It revolves around the decisions made by the study sponsor,
|
|
|
|
@ -130,6 +191,7 @@ A quick summary of the nodes of the DAG, the exact representation in the data, a
|
|
|
|
\begin{enumerate}
|
|
|
|
\begin{enumerate}
|
|
|
|
\item \texttt{Will Terminate?}:
|
|
|
|
\item \texttt{Will Terminate?}:
|
|
|
|
If the final status of the trial was \textit{terminated}
|
|
|
|
If the final status of the trial was \textit{terminated}
|
|
|
|
|
|
|
|
and comes from the AACT dataset.
|
|
|
|
or \textit{completed}.
|
|
|
|
or \textit{completed}.
|
|
|
|
\item \texttt{Enrollment Status}:
|
|
|
|
\item \texttt{Enrollment Status}:
|
|
|
|
This describes the current enrollment status of the snapshot, e.g.
|
|
|
|
This describes the current enrollment status of the snapshot, e.g.
|
|
|
|
@ -147,19 +209,25 @@ A quick summary of the nodes of the DAG, the exact representation in the data, a
|
|
|
|
\begin{enumerate}
|
|
|
|
\begin{enumerate}
|
|
|
|
\item \texttt{Condition}:
|
|
|
|
\item \texttt{Condition}:
|
|
|
|
The underlying condition, classified by ICD-10 group.
|
|
|
|
The underlying condition, classified by ICD-10 group.
|
|
|
|
This impacts every other aspect of the model.
|
|
|
|
This impacts every other aspect of the model and is pulled from
|
|
|
|
|
|
|
|
the AACT dataset.
|
|
|
|
\item \texttt{Population (market size)}:
|
|
|
|
\item \texttt{Population (market size)}:
|
|
|
|
Multiple measures of the impact of the disease.
|
|
|
|
Multiple measures of the impact of the disease.
|
|
|
|
These are measured by the DALY cost of the disease in countries that have a
|
|
|
|
These are measured by the DALY cost of the disease, and are
|
|
|
|
High, High-Medium, Medium, Medium-Low, and Low development scores.
|
|
|
|
separated by the impact on countries with
|
|
|
|
|
|
|
|
High, High-Medium, Medium, Medium-Low, and Low
|
|
|
|
|
|
|
|
development scores.
|
|
|
|
This data comes from the Institute for Health Metrics and Evaluation's Global Burden of Disease study.
|
|
|
|
This data comes from the Institute for Health Metrics and Evaluation's Global Burden of Disease study.
|
|
|
|
\item \texttt{Elapsed Duration}:
|
|
|
|
\item \texttt{Elapsed Duration}:
|
|
|
|
A normalized measure of the time elapsed in the trial.
|
|
|
|
A normalized measure of the time elapsed in the trial.
|
|
|
|
It is computed from the original estimate of the trial's primary completion date and the registered start date.
|
|
|
|
It is computed from the original estimate of the trial's primary completion date and the registered start date.
|
|
|
|
I take the difference in days between these, and get the percentage of that time that has elapsed.
|
|
|
|
I take the difference in days between these, and get the percentage of that time that has elapsed.
|
|
|
|
|
|
|
|
This calculation is based on data from the snapshots and the
|
|
|
|
|
|
|
|
AACT final results.
|
|
|
|
\item \texttt{Decision to Proceed with Phase III}:
|
|
|
|
\item \texttt{Decision to Proceed with Phase III}:
|
|
|
|
If the compound development has progressed to Phase III.
|
|
|
|
If the compound development has progressed to Phase III.
|
|
|
|
This is included in the analysis by only including Phase III trials.
|
|
|
|
This is included in the analysis by only including
|
|
|
|
|
|
|
|
Phase III trials registered in the AACT dataset.
|
|
|
|
\end{enumerate}
|
|
|
|
\end{enumerate}
|
|
|
|
\item Unobserved Confounders (White Boxes)
|
|
|
|
\item Unobserved Confounders (White Boxes)
|
|
|
|
\begin{enumerate}
|
|
|
|
\begin{enumerate}
|
|
|
|
@ -168,14 +236,22 @@ A quick summary of the nodes of the DAG, the exact representation in the data, a
|
|
|
|
Cannot be observed, only estimated through scientific study.
|
|
|
|
Cannot be observed, only estimated through scientific study.
|
|
|
|
\item \texttt{Previously observed Efficacy and Safety}:
|
|
|
|
\item \texttt{Previously observed Efficacy and Safety}:
|
|
|
|
The information gathered in previous studies.
|
|
|
|
The information gathered in previous studies.
|
|
|
|
This is not available in my dataset because I don't have links to prior studies.
|
|
|
|
This is not available in my dataset because I don't
|
|
|
|
|
|
|
|
have links to prior studies.
|
|
|
|
\item \texttt{Currently observed Efficiency and Safety}:
|
|
|
|
\item \texttt{Currently observed Efficiency and Safety}:
|
|
|
|
The information gathered during this study.
|
|
|
|
The information gathered during this study.
|
|
|
|
This is only partially available, and so is treated as unavailable.
|
|
|
|
This is only partially available, and so is
|
|
|
|
After a study is over, the investigators are supposed to publish information about adverse events.
|
|
|
|
treated as unavailable.
|
|
|
|
|
|
|
|
After a study is over, the investigators
|
|
|
|
|
|
|
|
often publish information about adverse events, but only
|
|
|
|
|
|
|
|
those that meet a certain threshold.
|
|
|
|
|
|
|
|
As this information doesn't appear to be provided to
|
|
|
|
|
|
|
|
participants, we don't consider it.
|
|
|
|
\end{enumerate}
|
|
|
|
\end{enumerate}
|
|
|
|
\end{itemize}
|
|
|
|
\end{itemize}
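
% Illustrative sketch only: t_start, t_snap, and \hat{t}_end are assumed notation,
% and using the snapshot date as the reference point is an assumption.
As a sketch of the \texttt{Elapsed Duration} calculation described above, assume
that $t_{\mathrm{start}}$ is the registered start date,
$\hat{t}_{\mathrm{end}}$ is the original estimate of the primary completion date,
and $t_{\mathrm{snap}}$ is the date of the snapshot being evaluated
(these symbols, and the choice of the snapshot date as the reference point,
are introduced here only for illustration).
The normalized measure would then be approximately
\[
  \mathrm{Elapsed\ Duration} =
  100 \times
  \frac{t_{\mathrm{snap}} - t_{\mathrm{start}}}
       {\hat{t}_{\mathrm{end}} - t_{\mathrm{start}}},
\]
with all differences taken in days.
For example, under these assumptions, a snapshot taken 180 days after the start of
a trial originally planned to reach primary completion 720 days after its start
would have an elapsed duration of 25 percent.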
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
%
|
|
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
\begin{itemize}
|
|
|
|
\item Relationships of interest
|
|
|
|
\item Relationships of interest
|
|
|
|
\begin{enumerate}
|
|
|
|
\begin{enumerate}
|
|
|
|
|