
\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}

\begin{document}

% Clinical Trials Background Outline
% - ClinicalTrials.gov
% - Clinical trial progression
% -
% -
% -
% -
% -
%   -
%   -

Understanding why clinical trials succeed or fail requires understanding how
they operate and how their progress is documented.
The primary source of this operational data is ClinicalTrials.gov, where
investigators record key information about their trials' status and progression.
To show how my administrative data captures trial progression, I first
describe how investigators document their trials' states and transitions.
Figure \ref{Fig:Stages} is a flowchart of the states that a trial can occupy
and the decisions leading to each.
It also describes the knowledge obtained by the study operator
and how that influences further decisions.
The states are standardized and defined by the National Library of Medicine
\cite{usnlm_protocolregistrationdata_2024-06-17}.
Prior to a study, the trial investigators design the trial,
choose primary and secondary objectives, and decide how many participants
they need to enroll.
Once they have decided on these details, they post the trial to
\url{ClinicalTrials.gov} and choose a date to begin enrolling trial
participants.
% If the investigators decide to not continue with the trial before enrolling any
% participants, the trial is marked as ``Withdrawn''.
% If they begin enrolling participants, there are two methods to do so.
% The first is to enter an "Enrollment by invitation only" state where the
% trial operators extend invitations through their own connections to doctors
% and patients they are working with.
% The second is to enter a general ``Recruiting'' state, where participants apply
% to join the trial, and the sponsoring organization may extend invitations as
% before.
After a trial has enrolled enough participants, the sponsor moves it to an
``Active, not recruiting'' state to inform potential participants that it has
stopped recruiting.
During this time, the trial operators continue monitoring participants for
adverse events and tracking their disease severity and compliance with treatment.
Finally, when the investigators have obtained enough data to achieve their primary
objective, the clinical trial is closed and marked as ``Completed'' on
\url{ClinicalTrials.gov}.
If the trial is closed before achieving the primary objective, it is
marked as ``Terminated'' on
\url{ClinicalTrials.gov}.
Trials can be terminated because, for example, safety or efficacy evidence
suggested it was not worth continuing or because enrollment rates were too low
to achieve the primary objective within time and budget constraints.

\begin{figure}%[H] %use [H] to fix the figure here.
    \includegraphics[width=\textwidth]{../assets/img/ClinicalTrialStagesAndStatuses}
    \par \small
        Diamonds represent decision points,
        squares represent states of the clinical trial, and rhombuses represent data obtained by the trial.
    \caption[Clinical Trial Stages and Progression]{Clinical Trial Stages and Progression}
    \label{Fig:Stages}
\end{figure}

% Note the information we obtain about the trial from the final status:
% ``Withdrawn'', ``Terminated'', or ``Completed''.
As a trial goes through the different stages of recruitment, the investigators
update the records on ClinicalTrials.gov.
Even though there are only a few times that investigators are required
to update this information, it tends to be updated somewhat regularly during
enrollment as it is a way to communicate with potential enrollees.
When a trial is first posted, it includes information
such as planned enrollment,
planned end dates,
the sites at which it is being conducted,
the diseases that it is investigating,
the drugs or other treatments that will be used,
and who is sponsoring the trial.
As enrollment is opened and closed and sites are added or removed,
investigators will update the status and information
to help doctors and potential participants understand whether they should apply.

A trial can end in one of three ways.
The most desirable outcome is completion, where the trial achieves its
primary objective by gathering sufficient data about safety and efficacy.
However, trials may also end early, either through withdrawal,
where the investigators abandon the trial before enrolling any participants,
or through termination.
Termination occurs after enrollment has begun but before achieving the
primary objective.
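One way to summarize these three endings compactly is sketched below.
The notation is introduced here only for illustration and is not used elsewhere:
$y_i$ denotes the final status of trial $i$,
$e_i$ indicates whether any participant was enrolled,
and $o_i$ indicates whether the primary objective was achieved.
\[
y_i = \left\{
\begin{array}{ll}
\mbox{Withdrawn}  & \mbox{if } e_i = 0, \\
\mbox{Terminated} & \mbox{if } e_i = 1 \mbox{ and } o_i = 0, \\
\mbox{Completed}  & \mbox{if } e_i = 1 \mbox{ and } o_i = 1.
\end{array}
\right.
\]
Under this sketch, the final status alone reveals whether enrollment began and
whether the primary objective was reached, but not why a terminated trial stopped.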


Understanding why trials terminate early is the key goal of this work, but
is not straightforward.
Terminated trials typically record a
description of \textit{a single} reason for the termination.
This description does not necessarily list every reason that contributed to the
termination, and some trials provide no reason at all.
As an example, if a Principal Investigator leaves for another institution
(terminating the trial), this decision may be affected by things such as
a safety or efficacy concern,
a new competitor on the market,
difficulties recruiting participants,
or a lack of financial support from the study sponsor.
In this way, the stated reason may mask the underlying challenges that
led to the termination, so I must
infer the relative impact of operational difficulties by other means.


\todo{move the following}
To better describe termination causes, I suggest classifying them into
three broad categories.
The first category, Safety or Efficacy concerns, occurs when data suggests
the treatment is unsafe or unlikely to achieve its therapeutic goals.
While Khmelnitskaya
\cite{khmelnitskaya_competitionattritiondrug_2021}
describes these as scientific failures, I contend that they represent successful
knowledge gathering: the clinical trial process working as intended to
identify ineffective treatments.
The second category, Strategic concerns, encompasses business and
market-driven decisions such as changes in company priorities or
competitive landscape.
The final category, Operational concerns, includes practical challenges
like insufficient enrollment rates or loss of key personnel.
These latter two categories represent true failures of the trial process,
as they prevent us from learning whether the treatment would have
been safe and effective.


\subsection{Literature on Clinical Trials}\label{SEC:LitReview}

%Describe how clinical trials fit into the drug development landscape and how they proceed
Clinical trials are a required part of drug development.
Not only does the FDA require that a series of clinical trials demonstrate
sufficient safety and efficacy of a novel pharmaceutical compound or device,
but producers of derivative medicines may also be required to ensure that
their generic small molecule compound (such as ibuprofen or levothyroxine)
matches the performance of the originator drug if delivery or dosage is changed.
For large molecule generics (termed biosimilars) such as adalimumab
(brand name Humira, with biosimilars Abrilada, Amjevita, Cyltezo, Hadlima, Hulio,
Hyrimoz, Idacio, Simlandi, Yuflyma, and Yusimry),
the biosimilars are required to prove they have similar efficacy and safety to the
reference drug.


In the world of drug development, these trials are classified into different
phases of development\footnote{
\cite{anderson_fdadrugapproval_2022}
provide an overview of this process
while
\cite{commissioner_drugdevelopmentprocess_2020}
describes the process in detail.}.
Pre-clinical studies primarily establish toxicity and potential dosing levels.
% \cite{commissioner_drugdevelopmentprocess_2020}.
Phase I trials are the first attempt to evaluate safety in humans.
Participants are typically healthy individuals, and investigators measure how
the drug affects the body, identify potential side effects, and adjust dosing levels.
Sample sizes are often fewer than 100 participants.
% \cite{commissioner_drugdevelopmentprocess_2020}.
Phase II trials typically involve a few hundred participants and are where
investigators dial in dosing, research methods, and safety.
% \cite{commissioner_drugdevelopmentprocess_2020}.
A Phase III trial is the final trial before approval by the FDA, and is where
the investigator must demonstrate safety and efficacy with a large number of
participants, usually on the order of hundreds or thousands.
% \cite{commissioner_drugdevelopmentprocess_2020}.
Occasionally, a trial will be a multi-phase trial, covering aspects of either
Phases I and II or Phases II and III.
After a successful Phase III trial, the sponsor will decide whether or not
to submit an application for approval from the FDA.
Before filing this application, the developer must have completed
``two large, controlled clinical trials.''
% \cite{commissioner_drugdevelopmentprocess_2020}.
Phase IV trials are used after the drug has received marketing approval to
validate safety and efficacy in the general populace.
Throughout this whole process, the FDA is available to assist sponsors in
decision-making on topics such as study design, document review, and whether
to terminate the trial.
The FDA also reserves the right to place a hold on the clinical trial for
safety or other operational concerns, although this is rare
\cite{commissioner_drugdevelopmentprocess_2020}.


In the economics literature, most of the focus has been on describing how
drug candidates transition between different phases and their probability
of final approval.
% Lead into lit review
% Abrantes-Metz, Adams, Metz (2004)
\authorcite{abrantes-metz_pharmaceuticaldevelopmentphases_2004}
described the relationship between
various drug characteristics and how the drug progressed through clinical trials.
% This descriptive estimate was notable for using a
% mixed state proportional hazard model and estimating the impact of
% observed characteristics in each of the three phases.
They found that as Phase I and II trials last longer,
the rate of failure increases.
In contrast, Phase III trials generally have a higher rate of
success than failure after 91 months.
This may be because the purposes of Phases I and II differ
from the purpose of Phase III.

Continuing on this theme,
%DiMasi FeldmanSeckler Wilson 2009
\authorcite{dimasi_trendsrisksassociated_2010}
examine the completion rate of clinical drug
development and find that for the 50 largest drug producers,
approximately 19\% of their drugs under development between 1993 and 2004
successfully moved from Phase I to approval of a New Drug Application (NDA)
or Biologics License Application (BLA).
They note several changes in how drugs were developed over the years they
study, most notably that
drugs began to fail earlier in their development cycle in the
latter half of the period.
They suggest that this may reduce the cost of new drugs by eliminating late
and costly failures in the development pipeline.

Earlier work by
\authorcite{dimasi_valueimprovingproductivity_2002}
used data on 68 investigational drugs from 10 firms to simulate how reducing
time in development reduces the costs of developing drugs.
He estimated that shortening Phase III by one year would
reduce total costs by about 8.9\% and that moving 5\% of clinical trial failures
from Phase III to Phase II would reduce out-of-pocket costs by 5.6\%.

A key contribution to this drug development literature is the work by
\authorcite{khmelnitskaya_competitionattritiondrug_2021}
who created a causal identification strategy
to disentangle strategic exits from exits due to clinical failures
in the drug development pipeline.
She found that overall 8.4\% of all pipeline exits are due to strategic
terminations and that the rate of new drug production would be about 23\%
higher if those strategic terminations were eliminated.

The work that is closest to mine is the work by
\authorcite{hwang_failureinvestigationaldrugs_2016}
who investigated the reasons late-stage (Phase III)
clinical trials fail, with a focus on trials in the USA,
Europe, Japan, Canada, and Australia.
They identified 640 novel therapies and then studied each therapy's
development history, as outlined in commercial datasets.
They found that for late-stage trials that did not go on to receive approval,
57\% failed on efficacy grounds, 17\% failed on safety grounds, and 22\% failed
on commercial or other grounds.

Unfortunately, the work of both
\authorcite{hwang_failureinvestigationaldrugs_2016}
and
\authorcite{khmelnitskaya_competitionattritiondrug_2021}
ignores a potentially large cause of failures: operational challenges, i.e., when
issues running or funding the trial cause it to fail before achieving its
primary objective.
In a personal review of 199 randomly selected clinical trials which terminated
before achieving their primary objective,
I found that
14.5\% cited safety or efficacy concerns,
9.1\% cited funding problems (an operational concern),
and
31\% cited enrollment issues (a separate operational concern)\footnote{
Note that these figures differ from
\authorcite{hwang_failureinvestigationaldrugs_2016}
because I sampled from all stages of trials, not just Phase III trials
focused on drug development.
}.


\subsection{Introduction to \href{https://ClinicalTrials.gov}{ClinicalTrials.gov}}


%% Describe data here
Since September 27, 2007, those who conduct clinical trials of FDA-controlled
drugs or devices on human subjects must register
their trial at \url{ClinicalTrials.gov}
(\cite{anderson_fdadrugapproval_2022}).
This involves submitting information on the expected enrollment and duration of
trials, drugs or devices that will be used, treatment protocols and study arms,
as well as contact information for the trial sponsor and treatment sites.

When starting a new trial, the required information must be submitted
``\dots not later than 21 calendar days after enrolling the first human subject\dots''.
After the initial submission, the data is briefly reviewed for quality and
then the trial record is published and the trial is assigned a
National Clinical Trial (NCT) identifier
(\cite{anderson_fdadrugapproval_2022}).

Each trial's record is updated periodically, including a final update that must occur
within a year of completing the primary objective, although exceptions are
available for trials related to drug approvals or for trials with secondary
objectives that require further observation\footnote{This rule came into effect in 2017.}
(\cite{anderson_fdadrugapproval_2022}).
Other than the requirements for the first and last submissions, all other
updates occur at the discretion of the trial sponsor.
Because the ClinicalTrials.gov website serves as a central point of information
on which trials are active or recruiting for a given condition or drug,
most trials are updated multiple times during their progression.

There are two primary ways to access data about clinical trials.
The first is to search individual trials on ClinicalTrials.gov with a web browser.
This web portal shows the current information about the trial and provides
access to snapshots of previously submitted information.
Together, these features fulfill most of the needs of those seeking
to join a clinical trial.
For this project, I scraped these historical records to build
a set of snapshots for each trial.
%include screenshots?
The second way to access the data is through a normalized database set up by
the
\href{https://aact.ctti-clinicaltrials.org/}{Clinical Trials Transformation Initiative}
called AACT. %TODO: Get CITATION
The AACT database is available as a PostgreSQL database dump or a set of
flat files.
These dumps match a near-current version of the ClinicalTrials.gov database.
This format is amenable to large-scale analysis, but does not contain
information about the past state of trials.
I combined these two sources, using the AACT dataset to select
trials of interest and then scraping \url{ClinicalTrials.gov} to get
a timeline of each trial.
The result is a series of snapshots, each documenting a specific set of
recorded changes in a trial.
It is these snapshots that provide the opportunity to estimate the
data generating process corresponding to the clinical trials for
which I have data.
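
One possible notation for this structure is sketched below.
The symbols are introduced here only for illustration and are not fixed elsewhere:
trial $i$ has $T_i$ snapshots recorded at dates
$t_{i,1} < t_{i,2} < \dots < t_{i,T_i}$,
and $x_{i,k}$ collects the fields recorded in snapshot $k$,
such as status, planned enrollment, and sites.
The scraped history of trial $i$ is then
\[
H_i = \left\{ \left( t_{i,k},\, x_{i,k} \right) \right\}_{k=1}^{T_i},
\]
and the change documented by snapshot $k \geq 2$ is the difference between
$x_{i,k}$ and $x_{i,k-1}$.
In this notation, the AACT record for trial $i$ corresponds approximately to the
most recent snapshot, $x_{i,T_i}$.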

%%%%%%%%%%%%%%%%%%%%%%%% Model Outline

% The way I use this data is to predict the final status of the trial
% from the snapshots that were taken, in effect asking:
% ``how does the probability of a termination change from the current state
% of the trial if X changes?''
% -
% -
% -
% -
% -
% -
%

\end{document}