Compare commits

59 Commits
v1.0.0 ... main

Author SHA1 Message Date
Will King 52a88bcd61 Merge branch 'rewrite_section' 1 year ago
will king b4c9052fd1 Updated estimation 1 year ago
will king 9238da8d6a recording most recent updates 1 year ago
Will King 28b6404301 added details to causal story, enabled latex todos, updated laptop layout 1 year ago
Will King 35497f6fbd Merge branch 'rewrite_section' of https://git.youainti.com/youainti/ClinicalTrialsPaper into rewrite_section 1 year ago
will king 46cc82d8d3 Merge branch 'rewrite_section' of https://git.youainti.com/youainti/ClinicalTrialsPaper into rewrite_section 1 year ago
will king 9eaf8a6746 adding current work 1 year ago
Will King 22991aaf90 fixed paths in zellij layout for laptop 1 year ago
Will King 0a1eaeb468 changed zellij layout for laptop 1 year ago
will king 97cb6c03a8 Continued work on Causal story, minor changes to data section 1 year ago
will king 6cedf8832b adding zellij layout because it is helpful 1 year ago
will king e772309f67 recording current work 1 year ago
will king a9a6c4b224 saving current work and plans 1 year ago
Will King b59e05576c Finished introduction, added new background section. 1 year ago
Will King 340b1694a2 Merge branch 'rewrite_section' of /run/media/will/Ventoy/git_repos/jmp_remote into rewrite_section 1 year ago
Will King bee4bff18a minor adjustment to outline2 1 year ago
will king e54ef2e2c0 commmented out some lit that doesn't go in the intro. 1 year ago
will king ba7fddc5bb rewrote the introduction and introductory literature review. Fixed bib. 1 year ago
Will King b87b8c3db1 updated outline2 (new introduction), and added sql used to get information 1 year ago
Will King 94605e8c19 new outline to begin writing introdution and lit review 1 year ago
will king 676394f480 Created plan in outline.txt, added resources, and updated some sections. 2 years ago
will king 0ad55d2d54 Adjusted title, planned reorder of data/causal section. 2 years ago
will king c0b963fc07 minor changes 2 years ago
will king c5a7d495d1 more lit review work 2 years ago
will king 08fff0c078 Continuing to work on introduction and lit review 2 years ago
will king ae5aee1326 added plans to structure the lit review 2 years ago
will king 4d0b941f34 updated data processing and estimation 2 years ago
will king 8d164d2cb7 added merge on data processing stuff 2 years ago
will king 35fc36ce31 Added small updates 2 years ago
Will King 0f7aea88ff quick note about something to explore 2 years ago
will king ea52db78ed tracking updates to data and analysis 2 years ago
will king 5a79c7cf73 Merge branch 'main' of https://git.youainti.com/youainti/ClinicalTrialsPaper 2 years ago
will king 1c4af58cf5 updating lit review and data stuff 2 years ago
will king dc883d88b0 Merge branch 'main' of ssh://git.youainti.com:3022/youainti/ClinicalTrialsPaper 2 years ago
will king f73b05f181 Tracking changes from processing and estimation 2 years ago
will king 59633bf072 minor tweak to intro 2 years ago
will king 1556fe0eba updated lit review and other minor changes 2 years ago
Will King 3888cd194a added notes to myself about framing 2 years ago
Will King 1a83bac8b2 another minor edit to intro 2 years ago
Will King ffee9529fa began rewriting introduction 2 years ago
Will King 332a27ea22 added tables and references to intro 2 years ago
Will King 47d625e618 updated introduction 2 years ago
will king 526a7bd8e2 merge 2 years ago
will king 6fa7863d1f getting things up to date 2 years ago
will king 36d8d3a39f added some of the data linkers I used. 2 years ago
Will King 0b46692950 Merge branch 'main' of https://git.youainti.com/youainti/ClinicalTrialsPaper 2 years ago
Will King 15752e4150 got a reasonable draft 2 years ago
Will King bd245b73cb Previous version of presentation etc 2 years ago
Will King 3564b0a8b4 updating submodules 2 years ago
Will King e581d51c7a Presentation Version 3 years ago
will king 8560435ec3 Saving updates, added new causal graph 3 years ago
Will King 0e3e8e6b66 updated presentation and paper 3 years ago
will king 03f9c6081d removed old images, got presentation mostly there 3 years ago
will king 780da29dee removing dvi 3 years ago
Will King 755cc80a4a updated dependency on estimation 3 years ago
Will King 5a27e4a567 merged in past version of presentation. Turns out I had left it on git and had not merged it. 3 years ago
Will King 8c46418c5f added current presentation draft 3 years ago
Will King 94f16777fc added initial background information on clinical trials 3 years ago
will king a26f87ea72 updated remotes 3 years ago

3
.gitignore vendored

@ -12,3 +12,6 @@
 ## Ignore PDfs
 *.pdf
+*.dvi
+#ignore swap files
+*.swp

11
.gitmodules vendored

@ -1,6 +1,9 @@
-[submodule "ClinicalTrialsDataProcessing"]
-path = ClinicalTrialsDataProcessing
-url = ssh://gitea@gitea.kgjk.icu:3022/Research/ClinicalTrialsDataProcessing.git
 [submodule "ClinicalTrialsEstimation"]
 path = ClinicalTrialsEstimation
-url = ssh://gitea@gitea.kgjk.icu:3022/Research/ClinicalTrialsEstimation.git
+url = https://git.youainti.com/youainti/ClinicalTrialsEstimation.git
+[submodule "ClinicalTrialsDataProcessing"]
+path = ClinicalTrialsDataProcessing
+url = https://git.youainti.com/youainti/ClinicalTrialsDataProcessing.git
+[submodule "ClinicalTrials_DataLinkers"]
+path = ClinicalTrials_DataLinkers
+url = https://git.youainti.com/Research/ClinicalTrials_DataLinkers.git

@ -1 +1 @@
-Subproject commit a2c0e4dcc70a70041e4895698c9dd856defdb7ed
+Subproject commit 3311159ab63a459fd01b21fe38a8dd888f850734

@ -1 +1 @@
-Subproject commit 09d0faa84c30f0735b0a16a3159afd2d816e9296
+Subproject commit d25f5c2a0e672c361937e8c3b490a575714b8ec1

@ -0,0 +1 @@
Subproject commit 363dc5e3da77d56934fae6c0f7302d3f58e779d1

@ -12,6 +12,7 @@
 \input{../assets/preambles/GeneralPreamble}
 \usepackage{float}
+\usepackage{csquotes}
 %setup paragraph level indexing
@ -23,9 +24,13 @@
 \titlespacing*{\paragraph}
 {0pt}{3.25ex plus 1ex minus .2ex}{1.5ex plus .2ex}
-\title{The effects of market conditions on enrollment and completion of clinical trials\\ \small{Preliminary Draft}}
+\title{The effects of market conditions and enrollment on the
+completion of clinical trials\\ \small{Preliminary Draft}}
 \author{William King}
+\usepackage{multirow}
+\usepackage{multicol}
 \begin{document}
 \maketitle
@ -40,15 +45,27 @@
 \section{Introduction}\label{SEC:Introduction}
 %---------------------------------------------------------------
-\subfile{sections/01_introduction}
-%---------------------------------------------------------------
-\section{Literature Review}\label{SEC:LiteratureReview}
-%---------------------------------------------------------------
-\subfile{sections/05_LitReview}
+\subfile{sections/11_intro_and_lit}
+% \subfile{sections/01_introduction}
+% %---------------------------------------------------------------
+% \section{Literature Review}\label{SEC:LiteratureReview}
+% %---------------------------------------------------------------
+% \subfile{sections/05_LitReview}
+\section{Clinical Trial Background}\label{SEC:ClinicalTrials}
+\subfile{sections/12_clinical_trial_background}
+The paper proceeds as follows.
+Section \ref{SEC:Data} covers the data sources, the proposed
+data generating process, and the causal identification.
+Section \ref{SEC:EconometricModel} describes the econometric model
+used.
+Section \ref{SEC:Results} discusses the results of the analysis.
+\todo{Review this after writing a few more sections.}
 %---------------------------------------------------------------
-\section{Data}\label{SEC:Data}
+\section{Causal Story and Data}\label{SEC:Data}
 %---------------------------------------------------------------
+\subfile{sections/10_CausalStory}
 \subfile{sections/02_data}
 %---------------------------------------------------------------

@ -0,0 +1,84 @@
layout {
tab name="Main and Compile" cwd="~/research/PhD_Deliverables/jmp/Latex/Paper/" hide_floating_panes=true focus=true {
// This tab is where I manage main from.
// It opens Main.tex for my JMP, opens the PDF in okular (in a floating pane), and then gets ready to build the PDF.
pane size=1 borderless=true {
plugin location="tab-bar"
}
pane split_direction="vertical" {
pane edit="Main.tex" focus=true // This is the editor
pane split_direction="horizontal" {
// this is the compilation window
pane size="60%" command="compiletex" {
args "Main.tex"
start_suspended true
}
// This is the ls of sections
pane size="35%" command="ls"{
args "sections/"
}
}
}
floating_panes {
// here is where I run okular from, it is auto hidden
pane command="okular" {
args "Main.pdf"
}
}
pane size=2 borderless=true {
plugin location="status-bar"
}
}
tab name="sections" cwd="~/research/PhD_Deliverables/jmp/Latex/Paper/sections/" {
pane size=1 borderless=true {
plugin location="tab-bar"
}
pane split_direction="vertical" {
pane
pane stacked=true {
pane
pane
pane
pane
pane
pane
pane
pane
pane
}
}
pane size=2 borderless=true {
plugin location="status-bar"
}
}
tab name="git" cwd="~/research/PhD_Deliverables/jmp/Latex/Paper/" {
pane size=1 borderless=true {
plugin location="tab-bar"
}
pane split_direction="vertical" {
pane split_direction="horizontal" {
pane command="watch" {
args "--color" "git status"
// requires `git config --global color.status always` to be set
}
pane size="30%" {
focus true
}
}
pane command="git" {
args "log" "-n 10" "--all" "--oneline" "--graph" "--stat" "--decorate"
}
}
pane size=2 borderless=true {
plugin location="status-bar"
}
}
}

@ -0,0 +1,84 @@
layout {
tab name="Main and Compile" cwd="~/research/phd_deliverables/jmp/Latex/Paper" hide_floating_panes=true focus=true {
// This tab is where I manage main from.
// It opens Main.tex for my JMP, opens the PDF in okular (in a floating pane), and then gets ready to build the PDF.
pane size=1 borderless=true {
plugin location="tab-bar"
}
pane split_direction="vertical" {
pane edit="./Main.tex" focus=true // This is the editor
pane split_direction="horizontal" {
// this is the compilation window
pane size="60%" command="comlatex.sh" {
args "Main.tex"
start_suspended true
}
// This is the ls of sections
pane size="35%" command="ls"{
args "sections/"
}
}
}
floating_panes {
// here is where I run okular from, it is auto hidden
pane command="okular" {
args "Main.pdf"
}
}
pane size=2 borderless=true {
plugin location="status-bar"
}
}
tab name="sections" cwd="~/research/phd_deliverables/jmp/Latex/Paper/sections" {
pane size=1 borderless=true {
plugin location="tab-bar"
}
pane split_direction="vertical" {
pane
pane stacked=true {
pane
pane
pane
pane
pane
pane
pane
pane
pane
}
}
pane size=2 borderless=true {
plugin location="status-bar"
}
}
tab name="git" cwd="~/research/phd_deliverables/jmp/Latex/Paper/" {
pane size=1 borderless=true {
plugin location="tab-bar"
}
pane split_direction="vertical" {
pane split_direction="horizontal" {
pane command="watch" {
args "--color" "git status"
// requires `git config --global color.status always` to be set
}
pane size="30%" {
focus true
}
}
pane command="git" {
args "log" "-n 10" "--all" "--oneline" "--graph" "--stat" "--decorate"
}
}
pane size=2 borderless=true {
plugin location="status-bar"
}
}
}

@ -0,0 +1,18 @@
NEXT STEPS IN WRITING
- insert a description of the general approach I use:
- predicting, based on snapshots, the likelihood of termination.
- this needs to go between the description of the snapshots and the
causal inference introduction.
- Then I can use what I've written about the graph, and follow up with more information about the data.
Overall this would look like
- [x] Introduction of the question and general issues of confoundedness.
- [x] Clinical Trials Data Sources
- [x] Explain basic econometric modelling approach
- [ ] Then explain the graph, nodes, and confoundedness in more detail
- [ ] Then go over the rest of the data.
- [ ] Finally
- Discuss the number of datapoints.
- Review major challenges to causal identification (no enrollment model, small data size).

@ -0,0 +1,34 @@
Outlining for jmp
<intro>
Introduction and problem statement
*Explain what I am doing:*
</intro>
<literature>
Describe what has been done
- measuring failure rates & impact
Introduce different types of failure
- Scientific
- Strategic
- Operational
Efforts to measure failures
Medbio story to illustrate failure modes.
Operational and strategic failures undermine scientific process of discovery
*My effort is to separate...*: place my work in context
Introduce clinical trials' progressions, stages, and statuses.
</literature>
<causal model>
Derive causal model
</causal model>
<data>
Summarize data sources
</data>
<econometrics>
Introduce econometric model
</econometrics>
<results>
Discuss econometric results
</results>
Conclusion
Appendices
- in-depth data source info
- More econometric results

@ -0,0 +1,58 @@
In 19xx the United States Food and Drug Administration (FDA) was created to "QUOTE".
As of Sept 2022 \todo{Check Date} they have approved 6,602 currently-marketed compounds with Structured Product Labels (SPL)
and 10,983 previously-marketed SPLs.
%from nsde table. Get number of unique application_nubmers_or_citations with most recent end date as null.
In 2007, they began requiring that drug developers register and publish clinical trials on \url{https://clinicaltrials.gov}.
This provides a public mechanism through which clinical trial sponsors must explain
what they are trying to achieve and how it will be measured, and it gives the public the ability to
search for trials that they might enroll in.
Data such as this has become part of multiple datasets
(e.g. the Cortellis Investigational Drugs dataset or the AACT dataset from the Clinical Trials Transformation Initiative)
used to evaluate what drugs might be entering the market soon.
This raises a question: can we use this public data on clinical trials to describe what affects their success or failure?
In this work, I use updates to records on \url{https://ClinicalTrials.gov} to disentangle
how participant enrollment and competing drugs on the market affect the success or failure of clinical trials.
%Describe how clinical trials fit into the drug development landscape and how they proceed
Clinical trials are a required part of drug development.
Not only does the FDA require that a series of clinical trials demonstrate sufficient safety and efficacy of
a novel pharmaceutical compound or device, producers of derivative medicines may be required to ensure that
their generic small molecule compound -- such as ibuprofen or levothyroxine -- matches the
performance of the originator drug if delivery or dosage is changed.
For large molecule generics (termed biosimilars) such as Adalimumab
(Brand name Humira, with biosimilars Abrilada, Amjevita, Cyltezo, Hadlima, Hulio,
Hyrimoz, Idacio, Simlandi, Yuflyma, and Yusimry),
the biosimilars are required to prove they have similar efficacy and safety to the
reference drug.
When registering a clinical trial,
the investigators are required to
% discuss how these are registered and what data is published.
% Include image and discuss stages
% Discuss challenges faced
% Introduce my work
In the world of drug development, these trials are classified into different phases of development.
Pre-clinical studies may include
Phase I trials are the first attempt to evaluate safety and efficacy in humans, and usually \todo{Describe trial phases, get citation}
Phase II trials typically \todo{}
A Phase III trial is the final trial before approval by the FDA.
Phase IV trials are used after approval to ensure safety and efficacy in the general populace ....
In the economics literature, most of the focus has been on evaluating how drug candidates transition between
different phases and then on to approval.
% Now begin introducing work by Chris Adams
% Lead into lit review
% Causality
% Data
% Economic Model
% Results
% Conclusion

@ -0,0 +1,42 @@
How do I begin work on stuff
- next step is causal story. key points include
- we are trying to separate strategic and operational concerns. (why is this a difficult problem?)
- we can't trust what we are told
- terminations could be due to safety, strategic, or operational concerns.
- explaining confounding between
- population/market and enrollment.
- population/market and market conditions.
- market conditions and enrollment.
- describe other confounders
- safety and effectiveness
- duration <--> enrollment/termination
- Condition
- Decision to proceed with Phase III trial
- How do I handle this?
- Introduce Do-Calculus
- DAG model
- What do I need to control for, in some form or other?
CURRENTLY HERE:
- Introduce Data
- Clinical Trial Progression
- AACT gives us information on
- terminated/completed status
- compound-indication pairs
- MeSH/RxNorm links
- Snapshots
- Market Conditions
- can't directly measure alternate treatments/standards of care.
- Can get measures of USP - formulary alternatives
- Can get number of generics or brand names with same drug.
- Population Sizes
- IHME Global Burden of Disease dataset. Best measure of impact of a given disease category.
- DALYs
- How much data do I have?
- Econometric model
- for a given state, what is the probability it will terminate?
- more accurately for my dist-diff analysis: for a given state, what is the distribution of the probabilities it will terminate?
- basic bernoulli-logistic model, linear in parameters.
- What are the specific things I am looking at?
- number of competing treatments.
- delaying close of enrollment.
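
A minimal sketch of the "basic bernoulli-logistic model, linear in parameters" item above, assuming plain Python and no fitting library. The function names and the two example features (number of competing treatments, current enrollment) are hypothetical placeholders, not the variables used in the estimation code.

```python
import math
from typing import Sequence


def termination_probability(
    features: Sequence[float],
    coefficients: Sequence[float],
    intercept: float = 0.0,
) -> float:
    """Probability that a trial in a given state terminates.

    The predictor is linear in the parameters and is passed through the
    logistic function, which is the Bernoulli-logistic setup.
    """
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, features))
    # Numerically stable logistic function for large positive or negative inputs.
    if linear_predictor >= 0:
        return 1.0 / (1.0 + math.exp(-linear_predictor))
    exp_lp = math.exp(linear_predictor)
    return exp_lp / (1.0 + exp_lp)


def bernoulli_log_likelihood(
    terminated: Sequence[int], probabilities: Sequence[float]
) -> float:
    """Log-likelihood of observed outcomes (1 = terminated, 0 = completed)."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(terminated, probabilities)
    )


# Hypothetical state: 3 competing treatments, 250 participants enrolled.
example_probability = termination_probability([3.0, 250.0], [0.2, -0.004], intercept=-1.0)
```

In a Bayesian version of this sketch, the coefficients would be draws from a posterior, so each state would yield a distribution of termination probabilities rather than a single number.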

@ -0,0 +1,98 @@
\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}
\begin{document}
% hook - what makes drugs expensive? Mention high failure rate
% describe current research
% - Examine mechanisms by which clinical trials fail.
% - Mention data
% - Results
How to best address the high cost of pharmaceuticals is a crucial health
and fiscal policy question that has been debated for
decades.
Due to the complicated legal and competitive landscape, unintended consequences
are common
\cite{vandergronde_addressingchallengehighpriced_2017}.
One essential step to introduce a novel pharmaceutical - or even
to begin selling a generic compound - is to establish that the drug as packaged and sold will
have acceptable safety and efficacy profiles.
When evaluating these compounds in a clinical trial, multiple outcomes are possible:
\begin{enumerate}
\item The compound demonstrates sufficient safety and efficacy, and proceeds in the approval process.
\label{Item:EndSuccess}
\item The compound fails to demonstrate sufficient safety and efficacy, and the approval process halts.
\label{Item:EndFail}
\item The trial is terminated before it can achieve one of the first two
outcomes, for reasons unrelated to safety and efficacy concerns.
\label{Item:Terminate}
\end{enumerate}
\begin{table}
\caption{Potential States of Knowledge from a clinical trial}\label{tab:StatesOfKnowledge}
\begin{center}
\begin{tabular}{p{0.15\textwidth} p{0.2\textwidth}||p{0.25\textwidth}|p{0.25\textwidth}|}
\cline{3-4}
\multicolumn{2}{c|}{Drug-Indication Match} & safe and efficacious & not safe or not efficacious \\
\hline
\hline
\multirow{2}{0.15\textwidth}{Operations} & Success & Known good & Known bad \\
\cline{2-4}
& Failure & \multicolumn{2}{c|}{Unknown} \\
\cline{2-4}
\end{tabular}
\end{center}
\end{table}
\begin{table}
\caption{Clinical Trial end states}\label{tab:ClinicalTrialEndStates}
\begin{center}
\begin{tabular}{p{0.15\textwidth} p{0.2\textwidth}||p{0.25\textwidth}|p{0.25\textwidth}|}
\cline{3-4}
\multicolumn{2}{c|}{Drug-Indication Match} & safe and efficacious & not safe or not efficacious \\
\hline
\hline
\multirow{2}{0.15\textwidth}{Operations} & Success & Completion & Completion or Termination \\
\cline{2-4}
& Failure & \multicolumn{2}{c|}{Termination} \\
\cline{2-4}
\end{tabular}
\end{center}
\end{table}
While it is known that pharmaceutical companies withdraw some drugs from
their development pipeline due to commercialization concerns
(
\cite{khmelnitskaya_competition_2021}
and
\cite{van_der_gronde_addressing_2017}
), there are likely unseen
effects that might affect the overall drug pipeline.
One of these is the concern that when there are already approved therapies on
the market, patients might be loath to enroll in clinical trials,
causing the trial to fail for reasons unrelated to the scientific or
commercial viability of the therapy.
To adequately guide public policy it is crucial that robust, causally-identified
statistical models are available to describe the interaction between
various players within the space.
This work endeavors to estimate the change in probability of successful completion
of a clinical trial due to the existence of alternative drugs on the market.
In particular, it seeks to establish whether such an impact is mediated
by enrollment patterns or is caused more directly.
The paper proceeds as follows: a brief literature review in \cref{SEC:LiteratureReview},
a description of the causal model in \cref{SEC:CausalIdentification},
followed by a description of the data (\cref{SEC:Data}) and the
econometric model (\cref{SEC:EconometricModel}).
Preliminary results are presented in \cref{SEC:Results} and a discussion
of proposed improvements is included in \cref{SEC:Improvements}.
\end{document}

@ -2,51 +2,43 @@
 \graphicspath{{\subfix{Assets/img/}}}
 \begin{document}
% hook - what makes drugs expensive? Mention high failure rate
% describe current research
% - Examine mechanisms by which clinical trials fail.
% - Mention data
% - Results
How to best address the high cost of pharmaceuticals is a crucial health
and fiscal policy question that has been debated for
decades.
Due to the complicated legal and competitive landscape, unintended consequences
are common
\cite{van_der_gronde_addressing_2017}.
One critical aspect to successfully introduce a novel pharmaceutical or even
a generic compound is to establish that the drug as packaged and sold will
have acceptable safety and efficacy profiles.
This is done using clinical trials.
To adequately guide public policy it is crucial that robust, causally-identified
statistical models are available to describe the interaction between
various players within the space.
While it is known that pharmaceutical companies withdraw some drugs from
their development pipeline due to commercialization concerns
(
\cite{khmelnitskaya_competition_2021}
and
\cite{van_der_gronde_addressing_2017}
), there are likely unseen
effects that might affect the overall drug pipeline.
One of these is the concern that when there are already approved therapies on
the market, patients might be loath to enroll in clinical trials,
causing the trial to fail for reasons unrelated to the scientific or
commercial viability of the therapy.
This work endeavors to estimate the change in probability of successful completion
of a clinical trial due to the existence of alternative drugs on the market.
In particular, it seeks to establish whether such an impact is mediated
by enrollment patterns or is caused more directly.
The paper proceeds as follows: a brief literature review in \cref{SEC:LiteratureReview},
a description of the causal model in \cref{SEC:CausalIdentification},
followed by a description of the data (\cref{SEC:Data}) and the
econometric model (\cref{SEC:EconometricModel}).
Preliminary results are presented in \cref{SEC:Results} and a discussion
of proposed improvements is included in \cref{SEC:Improvements}.
Developing novel, safe, and effective pharmaceutical compounds is difficult.
From identifying promising treatment targets and potential compounds to ensuring the drug can be properly delivered within the body, a great deal of scientific work needs to go well.
The regulatory and market conditions in which they exist add to this difficulty.
For example, regulations are designed to reduce the number of drugs released
to market with significant issues, such as in the case of VIOXX
\cite{krumholz_whathavewe_2007}
or the Purdue Pharma scandal
\cite{officepublicaffairsjusticedepartment_2020}.
These regulations, such as clinical trial standards
\todo{add citation to clinical trials here},
increase the costs of developing new drugs, adding to the business concerns
already present, including competitors already in the market or close to
entering and the overall demand to address a given condition.
This work is the first that endeavors to separate the causal effect
of an operational concern (participant enrollment) from that of strategic
concerns (market size and competitors in the market)
on individual clinical trials.
%begin discussing failures
%I am thinking I'll discuss marketing and operational failures
%I somehow need to step away from the drug development framing and soften it to
%... what? drug investigation?
From these general challenges we can begin to classify failures in drug
development into a hierarchy of causes.
\cite{khmelnitskaya_competitionattritiondrug_2021}
described two general causes for a drug to exit the drug-development pipeline:
strategic exits and scientific failure.
Similarly
\cite{hwang_failure_2016}
ascribe failures of Phase III trials to issues with safety,
efficacy, or other (business) concerns.
Understanding both why and how the development of drugs fails -- for both
novel and derivative pharmaceuticals -- is key to ensuring that both innovation
and availability are maximized.

@ -3,6 +3,7 @@
 \begin{document}
 In the sections below, I examine each source of data, their key features,
+how they match with the variables in the Structural Model DAG,
 and describe applicable terminology (\cref{datasources}).
 I then discuss how these sources were tied together (\cref{datalinks}) and
 describe the specific data used in the analysis (\cref{dataintegration}).
@ -54,7 +55,7 @@ most trials are updated multiple times during their progression.
 There are two primary ways to access data about clinical trials.
 The first is to search individual trials on ClinicalTrials.gov with a web browser.
 This web portal shows the current information about the trial and provides
-access to snapshots of previous versions of the same information.
+access to snapshots of previously submitted information.
 Together, these features fulfill most of the needs of those seeking
 to join a clinical trial.
 %include screenshots?
@ -64,14 +65,13 @@ the
 called AACT. %TODO: Get CITATION
 The AACT database is available as a PostgreSQL database dump or set of pipe (``$\vert$'')
 delimited files and matches the current version of the ClinicalTrials.gov database.
-This format is amenable to large scale analysis, but does not contain information about past
-state of trials.
-One of the main products of this research was the creation of a set of Python scripts to
+This format is amenable to large scale analysis, but does not contain information about
+the past state of trials.
+I created a set of Python scripts to
 incorporate the historical data on clinical trials available through the web
 portal and merge it into a local copy of the standard AACT database.
-This novel dataset can be used to easily track changes across many trials,
-particularly in the areas of enrollment and expected duration.
+This novel dataset can be used to easily track changes as trials progress.
 %describe the data NCT, trial records, mesh_terms, etc
 In this combined dataset of current and historical trial records, there are a few
@ -112,8 +112,8 @@ areas of particular interest.
 \subsubsection{Drug Compounds and Structured Product Labels (SPLs)}
 When a drug is licensed for sale in the U.S., it is not just the active
-ingredients that are licensed, but also the dosage.
-Each of these combined dosage and compound pairs is assigned a unique
+ingredients that are licensed, but also the dosage and route of administration.
+Each of these compound/dosage/route combinations is assigned a unique
 National Drug Code (NDC).
 %mention orange book
 The list of approved NDCs is released regularly in the FDA's
@ -277,22 +277,6 @@ It is made available through an API hosted by the NLM.
 One key feature is the ability to use a basic text search to find matching
 terms in various terminologies.
-In order to link clinical trials to standardized ICD-10 conditions and thus
-to the Global Burden of Disease data, I wrote a Python script to search the
-UMLS system for ICD-10 codes that matched the MeSH descriptions for
-each trial.
-This search resulted in generally three categories of search results:
-\begin{enumerate}
-\item The results contained a few entries, one of which was obviously correct.
-\item The results contained a large number of entries, a few of which were correct.
-\item The results did not contain any matches.
-\end{enumerate}
-In these cases I needed a way to validate each match and potentially add my own
-ICD-10 codes to each trial.
-To this end I built a website that allows one to quickly review and edit these
-records.
-The effort to manually review this data is ongoing.
 \subsection{Data Integration}\label{dataintegration}
@ -307,6 +291,7 @@ Below is more information about how the data was used in the analysis.
 For clinical trials, I captured each update that occurred after the start date
 and prior to the primary completion date of the trial.
 For clarity I will refer to each of these as a snapshot of the trial.
 For each snapshot I recorded the enrollment (actual or anticipated),
 the date it was submitted, the planned primary completion date,
 and the trial's overall status at the time.
@ -332,11 +317,34 @@ matters it is only about $[0,3]$. %good to put a graph here
I also included the current status by encoding it to dummy parameters. I also included the current status by encoding it to dummy parameters.
%Describe linking drugs/getting number of brands %Describe linking drugs/getting number of brands
As a basic measure of market conditions I have gathered the number of brands As an initial measure of market conditions I have gathered the number of brands
that are producing drugs containing the compound(s) of interest in the trial. that are producing drugs containing the compound(s) of interest in the trial.
This was done by extracting the RxCUIs that represented the drugs of interest, This was done by extracting the RxCUIs that represented the drugs of interest,
then linking those to the RxCUIs that are brands containing those ingredients. then linking those to the RxCUIs that are brands containing those ingredients.
As a secondary measure of market conditions, I linked clinical trials to the
USP Drug Classification list.
Once I had linked the drugs used in a trial to the applicable USP DC category
and class, I could find the number of alternative brands in that class.
This matching was performed by hand, using a custom web interface to the database.
In order to link clinical trials to standardized ICD-10 conditions and thus
to the Global Burden of Disease data, I wrote a Python script to search the
UMLS system for ICD-10 codes that matched the MeSH descriptions for
each trial.
This search generally produced three categories of results:
\begin{enumerate}
\item The results contained a few entries, one of which was obviously correct.
\item The results contained a large number of entries, a few of which were correct.
\item The results did not contain any matches.
\end{enumerate}
In these cases I needed a way to validate each match and potentially add my own
ICD-10 codes to each trial.
This matching was also performed by hand, using a separate custom web interface to the database.
The effort to manually match ICD-10 codes and USP DC categories and classes is ongoing.
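
The triage implied by the three categories of search results can be sketched as follows. This is only an illustration under stated assumptions: the function name \texttt{triage\_matches}, the \texttt{Category} labels, the example trial identifiers, and the cutoff of five candidates separating ``a few'' from ``a large number'' are all hypothetical, and none of them is taken from the actual script or from the UMLS API.

```python
from enum import Enum


class Category(Enum):
    FEW = "few candidates, likely one obvious match"
    MANY = "many candidates, a few may be correct"
    NONE = "no candidates, code must be added by hand"


def triage_matches(candidates, few_cutoff=5):
    """Sort one trial's candidate ICD-10 codes into a review category.

    `candidates` is a list of ICD-10 code strings returned by a search;
    `few_cutoff` is an assumed threshold, not a value from the paper.
    """
    if not candidates:
        return Category.NONE
    if len(candidates) <= few_cutoff:
        return Category.FEW
    return Category.MANY


# Hypothetical search results keyed by a made-up trial identifier.
results = {
    "trial_a": ["E11"],
    "trial_b": ["C50", "C50.1", "C50.2", "C50.3", "C50.4", "C50.5", "C50.9"],
    "trial_c": [],
}
review_queue = {trial: triage_matches(codes) for trial, codes in results.items()}
```

Every category still requires manual review; the sketch only orders the work so that the trials with no candidates, which need codes entered by hand, can be separated from those that only need confirmation.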
%Describe linking icd10 codes to GBD %Describe linking icd10 codes to GBD
% Not every icd10 code maps, so some trials are excluded. % Not every icd10 code maps, so some trials are excluded.
%Describe categorizing icd10 codes %Describe categorizing icd10 codes
@ -348,4 +356,6 @@ To get the best estimate of the size of the population associated with a disease
each trial is linked to the most specific disease category applicable. each trial is linked to the most specific disease category applicable.
As not every ICD-10 code is linked to a condition in the GBD, those without any As not every ICD-10 code is linked to a condition in the GBD, those without any
applicable conditions are dropped from the dataset. applicable conditions are dropped from the dataset.
\end{document} \end{document}

@ -2,81 +2,78 @@
\graphicspath{{\subfix{Assets/img/}}} \graphicspath{{\subfix{Assets/img/}}}
\begin{document} \begin{document}
% % Introduce clinicaltrials.gov
% % - Describe different statuses
% % - status flowchart
% % Introduce causal model
% % - Diagram
% % - List each node and what they influence (and why)
% % Begin Discussing Data
% % - Where did I get data for each node?
%
% When any clinical trial is conducted, it goes through three distinct stages:
% pre-trial, active, and decision to conclude.
% In figure \ref{Fig:Stages}, you can see the component parts of each stage.
%
% \begin{figure}[H] %use [H] to fix the figure here.
% \includegraphics[width=\textwidth]{../assets/img/ClinicalTrialStagesAndStatuses}
% \caption{Model of Statuses}
% \label{Fig:Stages}
% \end{figure}
%
% In the pre-trial stage, the sponsoring organization chooses to run the trial,
% they register the trial on \url{ClinicalTrials.gov}, and then decide if they
% will begin enrollment.
% Many registered trials are withdrawn at this point, before the trial has opened
% for enrollment.
% Once enrollment has opened
Because running randomized experiments on companies running clinical trials
is unlikely to happen anytime soon,
causal identification will depend on observational methods.
I use the do-calculus approach developed by Judea Pearl
\cite{pearl_CausalityModels_2009}
to describe what affects the success of a Phase III clinical trial.
I then use that model to derive the econometric model capable of estimating
the effect of extending the recruiting period or of having an additional
competing drug.
Because running experiments on companies running clinical trials is not going
to happen anytime soon, causal identification will depend on creating a
structural causal model. % In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes
In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes % the data generating model.
the data generating model. The proposed data generating model consists of a decision maker
The proposed data generating model consists of a decision maker, the study -- the study sponsor --
sponsor, who must decide whether to let a trial run to completion or terminate who must decide whether to let a trial run to completion or terminate
the trial early. the trial early.
While receiving updates regarding the status of the trial, they ask questions While receiving updates regarding the status of the trial, they try to
such as: answer questions such as:
\begin{itemize} \begin{itemize}
\item Do I need to terminate the trial due to safety incidents? \item Do I need to terminate the trial due to safety incidents?
\item Does it appear that the drug is effective enough to achieve our \item Does it appear that the drug is effective?
goals, justifying continuing the trial?
\item Are we recruiting enough participants to achieve the statistical
results we need?
\item Do the current market conditions and expectations about returns on
investment justify the expenditures we are making?
\end{itemize} \end{itemize}
When appropriate, the study sponsor terminates the trial.
If there are not enough issues to terminate the trial, it continues until it
is completed.

Although I treat this as a single agent, in reality, there are multiple
stakeholders involved in choosing whether the trial should continue, including
those running the trial (which may be a separate firm),
the company developing the drug, additional rightsholders,
or funding organizations.

While conducting a trial, the safety and efficacy of a drug are driven by
fundamental pharmacokinetic properties of the compounds.
These are only imperfectly measured both prior to and during any given trial.
Previously measured safety and efficacy inform the decision to start the trial
in the first place while currently observed safety and efficacy results
help the sponsor judge whether or not to continue the trial.
Of course, these decisions are both affected by the specific condition being
treated due to differences in the severity of the symptoms.
% When appropriate, the study sponsor terminates the trial.
% If there are not enough issues to terminate the trial, it continues until it
% is completed.

When a trial has been started, it comes time to recruit participants.
Participants frequently depend on the advice of their physician when deciding
to join a trial or not.
As these physicians have a duty to seek their patients' best interest, they, along
with their patients, will evaluate if the previously observed safety and efficacy
results justify joining the trial over using current standard treatments.
Thus the current market conditions may affect the rate at which participants
enroll in the trial.
In the United States, clinical trials are required by law to be registered on
\url{ClinicalTrials.gov}, where they are made available to the public.
Trials must be registered

The enrollment of participants in a trial depends on a few other factors.
The condition or disease of interest and how it progresses will determine how long
recruitment will be held open versus just an observation of treatment arms.
Additionally, a trial that has already reached a high enough enrollment will often
close recruitment by switching to an ``Active, not recruiting'' stage to manage costs.
Finally, enrolling participants depends on how difficult it is to find people
who suffer from the condition of interest.

The preceding issue of population size also affects the number of alternatives available.
When there are fewer people affected by the disease, the smaller market reduces
possible profitability, all else equal.
Thus the likelihood of companies paying the sunk costs to develop drugs for
these conditions may be lower.
Finally, the number of alternatives on the market may affect the return on
investment directly, causing a trial to terminate early if the return is
not high enough.
\begin{figure}[H] %use [H] to fix the figure here.
\includegraphics[width=\textwidth]{../assets/img/dagitty-model.jpg}
\caption{Causal Model}
\label{Fig:CausalModel}
\end{figure}
% %
By using Judea Pearl's do-calculus, I can show that an adjustment
set consisting of the decision to conduct a Phase III trial, the condition of interest,
the current status of the trial, and the population size will causally
identify the direct effects of enrollment and market alternatives on the
probability of termination.
This is easily verified through the backdoor criterion, which states that
if every path between the exposure and outcome that starts with an arrow
flowing into the exposure is blocked by one of the values in the adjustment
set, then the effect of the exposure on outcome is causally identified
(\cite{pearl_causality_2000}).
It can be easily verified visually from the DAG that this is the case.
\end{document} \end{document}

@ -47,8 +47,8 @@ The betas are distributed
\end{align} \end{align}
With hyperparameters With hyperparameters
\begin{align} \begin{align}
\mu_k \sim \text{Normal}(0,1) \\ \mu_k \sim \text{Normal}(0,0.05) \\
\sigma_k \sim \text{Gamma}(2,1) \sigma_k \sim \text{Gamma}(4,20)
\end{align} \end{align}
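
To see how much tighter the revised hyperpriors are, the implied moments can be computed directly. The sketch below assumes the Gamma distribution is parameterized by shape and rate (as in Stan); this section does not state the parameterization, so treat it as an assumption. The helper name \texttt{gamma\_mean\_sd} is hypothetical.

```python
import math


def gamma_mean_sd(shape, rate):
    """Mean and standard deviation of Gamma(shape, rate).

    Assumes the shape-rate parameterization: mean = shape / rate,
    variance = shape / rate**2.
    """
    return shape / rate, math.sqrt(shape) / rate


# Old and new hyperpriors on sigma_k, as written in the hunk above.
old_mean, old_sd = gamma_mean_sd(2, 1)
new_mean, new_sd = gamma_mean_sd(4, 20)

print(f"old sigma_k prior: mean={old_mean:.3f}, sd={old_sd:.3f}")
print(f"new sigma_k prior: mean={new_mean:.3f}, sd={new_sd:.3f}")
```

Under that assumption the prior mean of $\sigma_k$ falls from 2 to 0.2 and its standard deviation from about 1.41 to 0.1, so the revised prior concentrates mass on much smaller group-level variation.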

@ -3,80 +3,97 @@
\begin{document} \begin{document}
This paper sits within an intersection of health and industrial organization economics
that is frequently studied.
Encouraging a strong supply of novel and generic pharmaceuticals contributes
in important ways to both public health and fiscal policy.
Not only is the pathway to drug approval long, as many as 90\% of compounds
that begin human trials fail to gain approval
(\cite{khmelnitskaya_competition_2021}).
Complicating this is the complex regulatory and competitive environment in
which pharmaceutical companies operate.

%%%%%%%%% Why are drugs so expensive?

% van der Grond, Uyle-de Groot, Pieters 2017
% - What causes high costs of drugs?
% - High level synthesis of discussion regarding causes
% - Academic and non-academic sources
% -
% -

% Outline
% - Introduce and frame problem
% - Phases & regulatory part
% - Large number of failures at each phase
% - There are multiple ways to measure this
% - Estimation of failures at phase and failures per development path
% - Talk about impact of making these closer together
% - Trying to develop more by tweaking external world:
% - Pull incentives
% - Increase in market sizes.
% - Uncertainty in Intellectual Property
% - Understanding failure modes
% - EK and Hwang
% - discuss missing section of operational concerns
% - Introduce metabio
% - Once again bring up my work here.
%%%%%%%%%%%%%%%% What do we know about clinical trials?
% Hwang, Carpenter, Lauffenburger, et al (2016)
% - Why do investigational new drugs fail during late stage trials?
\citeauthor{hwang_failure_2016} (\citeyear{hwang_failure_2016})
investigated causes for which late stage (Phase III)
clinical trials fail across the USA, Europe, Japan, Canada, and Australia.
They found that for late stage trials that did not go on to receive approval,
57\% failed on efficacy grounds, 17\% failed on safety grounds, and 22\% failed
on commercial or other grounds.
For context, this current work hopes to be able to distinguish some of the
mechanisms behind those commercial or other failures.
\subsection{Drug development process and failure rates}
% Abrantes-Metz, Adams, Metz (2004) % Abrantes-Metz, Adams, Metz (2004)
% - What correlates with successfully passing clinical trials and FDA review? % - What correlates with successfully passing clinical trials and FDA review?
% - % -
In \citeyear{abrantes-metz_pharmaceutical_2004}, \cite{abrantes-metz_pharmaceutical_2004}
\citeauthor{abrantes-metz_pharmaceutical_2004}
described the relationship between described the relationship between
various drug characteristics and how the drug progressed through clinical trials. various drug characteristics and how the drug progressed through clinical trials.
This non-causal estimate was notable for using a This descriptive estimate used a
mixed state proportional hazard model and estimating the impact of mixed state proportional hazard model and estimated the impact of
observed characteristics in each of the three phases. observed characteristics in each of the three phases.
They found that as trials last longer, the rate of failure increases for They found that as trials last longer, the rate of failure increases for
Phase I \& II trials, while Phase 3 trials generally have a higher rate of Phase I and II trials, while Phase 3 trials generally have a higher rate of
success than failure after 91 months. success than failure after 91 months.
% Ekaterina Khmelnitskaya (2021)
% - separates scientific from market failure of the clinical drug pipeline
In her doctoral dissertation, Ekaterina Khmelnitskaya studied the transition of
drug candidates between clinical trial phases.
Her key contribution was to find ways to disentangle strategic exits from the
development pipeline and exits due to clinical failures.
She found that overall 8.4\% of all pipeline exits are due to strategic
terminations and that the rate of new drug production would be about 23\%
higher if those strategic terminations were eliminated
(\cite{khmelnitskaya_competition_2021}).

%DiMasi FeldmanSeckler Wilson 2009
\cite{dimasi_TrendsRisks_2010} examine the completion rate of clinical drug
development and find that for the 50 largest drug producers,
approximately X\% of their drugs under development
\todo{FILL IN X}
successfully completed the process.
They note a couple of changes in how drugs are developed over the years they
study (clinical development started between 1993 and 2004).
One is that drugs began to fail earlier in their development cycle in the
latter half of the time they studied.
This may be an operational change to reduce the cost of new drugs.
\cite{dimasi_ValueImproving_2002}
used data on 68 investigational drugs from 10 firms to simulate how reducing
time in development reduces the costs of developing drugs.
He estimates that reducing Phase III of clinical trials by one year would
reduce total costs by about 8.9\% and that moving 5\% of clinical trial failures
from phase III to Phase II would reduce out of pocket costs by 5.6\%.
% Waring, Arrosmith, Leach, et al (2015) % Waring, Arrosmith, Leach, et al (2015)
% - Atrition of drug candidates from four major pharma companies % - Atrition of drug candidates from four major pharma companies
% - Looked at how phisicochemical properties affected clinical failure due to safety issues % - Looked at how phisicochemical properties affected clinical failure due to safety issues
%not in this version % Don't think this is applicable.
\subsection{Market incentives and drug development}
%%%%%%%%% What do we know about drug development incentives?
\subsection{What do we know about drug development incentives?}
% Introduce section
% - Dranov et al 2022 - demand pull seems to bias follow up drug development.
% - increasing demand doesn't necessarily result in new compounds (check this). Risks.
\cite{dranove_DoesConsumer_2022} examined whether increased demand for drugs
will increase the development of novel drugs.
Using measures of the scientific novelty of drug compounds after the creation
of Medicare part D, they found that most development occurred in the least
novel categories of drugs, in spite of a relatively constant growth in novel
compounds.
\cite{dranove_DoesConsumer_2022} use the implementation of Medicare part D
to examine whether the production of novel or follow up drugs increases during
the following 15 years.
They find that when Medicare part D was implemented -- increasing senior
citizens' ability to pay for drugs -- there was a (delayed) increase
in drug development, with effects concentrated among compounds that were least
innovative according to their classification of innovations.
They suggest that this is due to financial risk management, as novel
pharmaceuticals have a higher probability of failure compared to the less novel
follow up development.
This is what leads risk-averse companies to prefer follow up development.
%%%%%%%%% What do we know about drug development incentives?
% Dranov, Garthwaite, and Hermosilla (2022)
% - does the demand-pull theory of R&D explain novel compound development?
% - no, it is biased towards follow-on drug R&D.
% Acemoglu and Linn % - acemoglu and linn 2004 - population size matters.
% - Market size in innovation % - Population ties into the number of drugs available, and operational (recruitment) concerns
% - In general, there are going to be many confounding variables.
% -
% - Exogenous demographic trends has a large impact on the entry of non-generic drugs and new molecular entitites. % - Exogenous demographic trends has a large impact on the entry of non-generic drugs and new molecular entitites.
On the side of market analysis, %TODO:remove when other sections are written up. On the side of market analysis,
\citeauthor{acemoglu_market_2004} \citeauthor{acemoglu_market_2004}
(\citeyear{acemoglu_market_2004}) (\citeyear{acemoglu_market_2004})
used exogenous demographic changes to show that the
@ -86,12 +103,175 @@ entry of new drugs by 6\%, mostly concentrated among generics.
Among non-generics, a 1\% increase in potential market size Among non-generics, a 1\% increase in potential market size
(as measured by demographic groups) leads to a 4\% increase in novel therapies. (as measured by demographic groups) leads to a 4\% increase in novel therapies.
% Cerda 2007 - Endogenous innovations in the pharmaceutical industry
% from abstract %TODO: Read better
% Market size, population, and existence of drugs are endogenous
% from the abstract I get the impresssion that it is:
% - large population -> large market -> more profitable -> more drugs
% - more drugs -> better survivability -> larger market
% Applicable because: Need to separate population and market effects.
% Does this mess with my results? I don't think so because of the relatively short time in trials. Not enough time to effect population back, but it might have another effect.
\cite{cerda_EndogenousInnovations_2007}
suggests a two-way, long term relationship between market size and drug
development.
They suggest that a large population with a condition implies a (relatively)
larger market, which improves the profitability and thus the number of drugs for that
condition.
These drugs then reduce mortality, increasing the relative population.
They do find evidence of the impact of both population and market size
on the creation of new drugs.
% van der gronde et al 2017 Addressing the challenge of high-price prescription drugs
% Massive number of policies used to try to reduce costs. These will affect production decisions.
% Some of the unintended consequences of that (in terms of reduced development incentives) include
% - reducing development costs - side effect of lower quality evidence
% - Preference policy (e.g. policies about using generics first etc) - side effect of shorter life cycle for patented (novel) drugs.
% - these are focused on reducing expenditures, i.e. they reduce profit. Some of them feed back into the development process.
\cite{vandergronde_AddressingChallenge_2017}
documents many of the factors driving drug development choices:
\begin{itemize}
\item Policies that encourage low cost generics shorten the life cycle of
patented/novel drugs.
\item Some diseases have lower safety and efficacy standards applied to them
compared to similar diseases. These tend to have higher R\&D due to the
lower costs involved.
\item As much of the "low hanging fruit" in drug development has been developed,
R\&D expenses have been increasing.
\end{itemize}
% Dubois et al 2015 - Market Size and pharmaceutical innovation
% estimate the relationship between marekt size and the innovation in pharmaceuticals
% elasticity of innovation w.r.t. expected market size of 0.23, thus $2.5 billion in
% market size required to get a new chemical entity.
\cite{dubois_MarketSize_2015}
examined the ``elasticity of innovation'', i.e. the ``additional revenue required
to support the invention of a new chemical entity.''
They found that a marginal drug will require approximately a \$2.5 billion increase
in expected revenue.
% Gupta % Gupta
% - Inperfect intellectual property rights in the pharmaceutical industry % - Inperfect intellectual property rights in the pharmaceutical industry
%\cite{GupaPhd2023} \cite{gupta_OneProduct_2020} discovered that uncertainty around which patents
might apply to a novel drug causes a delay in the entry of generics after
the primary patent has expired.
She found that this delay in delivery is around 3 years.
\subsection{What do we know about clinical trial operations?}
%interview with Adam George
% - clinical trials are often handled by contractors
% - they plan sites, start times, etc from beginning.
% - Running late is normal.
In a personal interview with someone who works for a company that runs clinical
trials, I learned about how clinical trials will typically proceed.
\todo{Figure out best way to cite this}
\begin{itemize}
\item Quote a job (one side of company): N, timeline, etc
\item Allocate resources (sites, doctors, etc) to try to accomplish
\item Sales vs Operations conflict, leading to lateness/issues delivering, etc.
\end{itemize}
% Bess Stillman - look at difficulties joining oncology trials
% Random sample of Clinicaltrials.gov - how many closed due to operational problems?
% TODO: random sample 171, about 30% mentioned recruitment issues
% Results on enrollment projection
% - nothing really good exists.
% - Multiple models, no comparison.
% - no cross validation, only tested on a few trials.
% Thus we should look at the effects that operational concerns have.
\subsection{Understanding Failures in Drug Development}
% DISCUSS: Different types of failures
There are myriad reasons that a drug candidate may not make it to market,
regardless of its novelty or known safety.
In this work, I focus on the failure of individual clinical trials, but the
categories of failure apply to the individual trials as well as the entire
drug development pipeline.
They generally fall into one of the following categories:
\begin{itemize}
\item Scientific Failure: When there are issues regarding
safety and efficacy that must be addressed.
The preeminent question is:
``Will the drug work for patients?''
%E.Khm, Gupta, etc.
\item Strategic Failure: When the sponsors stop development because of
profitability
%Whether or not the drug will be profitiable, or align with
%the drug developer's future Research \& Development directions i.e.
``Will producing the drug be beneficial to the
company in the long term?''
%E.Khm, Gupta, GLP-1s, etc.
\item Operational Failure: When the developer cannot conduct the
operations needed to meet scientific or strategic goals, i.e.
%Whether or not the developer can successfully conduct
%operations to meet scientific or strategic goals, i.e.
``What has prevented the company from being able to
finance, develop, produce, and market the drug?''
\end{itemize}
It is likely that a drug fails to complete the development cycle due to some
combination of these factors.
%USE MetaBio/CalBio GLP-1 story to illuistrate these different factors.
\cite{flier_DrugDevelopment_2024} documents the case of MetaBio, a company
he was involved in founding that was in the first stages of
developing a GLP-1 based drug for diabetes or obesity before being shut down
in .
MetaBio was a wholly owned subsidiary of CalBio, a metabolic drug development
firm, that received a \$30 million, 5-year investment from Pfizer to
pursue development of GLP-1 based therapies.
At the time it was shut down, it faced a few challenges:
\begin{itemize}
\item The compound had a short half life and they were seeking methods to
improve its effectiveness; a scientific failure.
\item Pfizer imposed a requirement that it be delivered through a route
other than injection (the known delivery mechanism); a strategic failure.
\item When Pfizer pulled the plug, CalBio closed MetaBio because they
could not find other funding sources; an operational failure.
\end{itemize}
The author states in his conclusion:
\begin{displayquote}
Despite every possibility of success,
MetaBio went down because there were mistaken ideas about what was
possible and what was not in the realm of metabolic therapeutics, and
because proper corporate structure and adequate capital are always
issues when attempting to survive predictable setbacks.
\end{displayquote}
From this we see that there was a cascade of issues leading to the failure to
develop this novel drug.
% NOW discuss efforts to measure the impact of different aspects
\citeauthor{hwang_failure_2016} (\citeyear{hwang_failure_2016})
investigated causes for which late stage (Phase III)
clinical trials fail across the USA, Europe, Japan, Canada, and Australia.
They found that for late stage trials that did not go on to receive approval,
57\% failed on efficacy grounds, 17\% failed on safety grounds, and 22\% failed
on commercial or other grounds.
In her doctoral dissertation, Ekaterina Khmelnitskaya studied the transition of
drug candidates between clinical trial phases.
Her key contribution was to find ways to disentangle strategic exits from the
development pipeline and exits due to clinical failures.
She found that overall 8.4\% of all pipeline exits are due to strategic
terminations and that the rate of new drug production would be about 23\%
higher if those strategic terminations were eliminated
(\cite{khmelnitskaya_competition_2021}).
% causal separation of strategic exits etc.
% I don't think I need to include modelling enrollment here.
% If it is applicable, it can show up in those sections later.
% Agarwal and Gaule 2022
% - Retrospective on impact from COVID-19 pandemic
% Not in this version
\end{document} \end{document}

@ -19,97 +19,101 @@ written or requires reparameterization.
%TODO: and info about how I learned about these diagnostics %TODO: and info about how I learned about these diagnostics
\subsubsection{Diagnostics} % \subsubsection{Diagnostics}
%Examine trank plots % %Examine trank plots
To identify which parameters were problematic, I first looked at trace rank % To identify which parameters were problematic, I first looked at trace rank
histograms. % histograms.
Under idea circumstances, each line (representing a chain) should exchange % Under idea circumstances, each line (representing a chain) should exchange
places with the other lines frequently. % places with the other lines frequently.
In both \cref{fig:mu_trank} and \cref{fig:sigma_trank}, most parameters seem % In both \cref{fig:mu_trank} and \cref{fig:sigma_trank}, most parameters seem
to mix well but there are a couple of exceptions. % to mix well but there are a couple of exceptions.
This warrants further investigation. % This warrants further investigation.
%
% \begin{figure}[H]
% \includegraphics[width=\textwidth]{../assets/img/mu_trank.png}
% \caption{Trace Rank Histogram: Mu values}
% \label{fig:mu_trank}
% \end{figure}
%
% \begin{figure}[H]
% \includegraphics[width=\textwidth]{../assets/img/sigma_trank.png}
% \caption{Trace Rank Histogram: Sigma values}
% \label{fig:sigma_trank}
% \end{figure}
%
% %Take a look at batman and points for mu
% In the case of the Mu values, a parallel coordinates plot
% doesn't seem to indicate any parameters as likely candidates
% for causing the issues with divergent transitions.
% \begin{figure}[H]
% \includegraphics[width=\textwidth]{../assets/img/mu_batman.png}
% \caption{Parallel Coordinate Plot: Mu values}
% \label{fig:mu_batman}
% \end{figure}
% Note that at each parameter, there is some level of dispersion between
% values that diverged.
%
% On the other hand, in the parallel coordinates plot for sigma values,
% it appears that most divergent transitions occur with values of
% sigma[1], sigma[3], sigma[6], and sigma[7] close to zero.
% \begin{figure}[H]
% \includegraphics[width=\textwidth]{../assets/img/sigma_batman.png}
% \caption{Parallel Coordinate Plot: Sigma values}
% \label{fig:sigma_batman}
% \end{figure}
% Overall this suggests that there is an issue with the specification
% of the covariance structures of the hyperparameters.
%
% Additional evidence that the covariance structure is incorrect comes from
% plotting pairs of parameter values and examining the chains with divergent
% transitions.
%
% \begin{figure}[H]
% \includegraphics[width=\textwidth]{../assets/img/sigma_pairs_5-9.png}
% \caption{Parameter Pairs plots: Sigma[5] through Sigma[9]}
% \label{fig:sigma_pairs_5-9.png}
% \end{figure}
% From this we can see that divergent pairs are highly correlated with the cases
% where sigma[6] or sigma[7] are equal to zero.
% This has an impact on the shape of both of those estimated parameters, causing
% both to be bimodal.
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/mu_trank.png}
\caption{Trace Rank Histogram: Mu values}
\label{fig:mu_trank}
\end{figure}
\begin{figure}[H] \subsection{Interpretation}
\includegraphics[width=\textwidth]{../assets/img/sigma_trank.png}
\caption{Trace Rank Histogram: Sigma values}
\label{fig:sigma_trank}
\end{figure}
%Take a look at batman and points for mu The key results so far are related to the distribution of differences in $p$.
In the case of the Mu values, a parallel coordinates plot
doesn't seem to indicate any parameters as likely candidates In figure \ref{fig:pred_dist_diff_delay} we see that while most trials do not see any increased risk
for causing the issues with divergent transitions. from a delay in closing enrollment, there is a small group that does experience this.
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/mu_batman.png}
\caption{Parallel Coordinate Plot: Mu values}
\label{fig:mu_batman}
\end{figure}
Note that at each parameter, there is some level of dispersion between
values that diverged.
On the other hand, in the parallel coordinates plot for sigma values,
it appears that most divergent transitions occur with values of
sigma[1], sigma[3], sigma[6], and sigma[7] close to zero.
\begin{figure}[H] \begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/sigma_batman.png} \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
\caption{Parallel Coordinate Plot: Sigma values} \caption{}
\label{fig:sigma_batman} \label{fig:pred_dist_diff_delay}
\end{figure} \end{figure}
Overall this suggests that there is an issue with the specification
of the covariance structures of the hyperparameters.
Additional evidence that the covariance structure is incorrect comes from
plotting pairs of parameter values and examining the chains with divergent
transitions.
Figure \ref{fig:pred_dist_dif_delay2} shows how this varies across disease categories.
\begin{figure}[H] \begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/sigma_pairs_5-9.png} \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay-group}
\caption{Parameter Pairs plots: Sigma[5] through Sigma[9]} \caption{}
\label{fig:sigma_pairs_5-9.png} \label{fig:pred_dist_dif_delay2}
\end{figure} \end{figure}
From this we can see that divergent pairs are highly correlated with the cases
where sigma[6] or sigma[7] are equal to zero.
This has an impact on the shape of both of those estimated parameters, causing
both to be bimodal.
\subsection{Interpretation}
Ignoring the diagnosed issues with the model, we do see some interesting
preliminary results.
%in mu, mu[5] shifted strongly
In \cref{fig:mu_posterior} we see that mu[5], the parameter corresponding
to enrollment appears to be strongly negative.
This is consistent with the idea that enrollment close to planned enrollment
decreases the probability of terminating the trial.
In \cref{fig:sigma_posterior}, sigma[2] (corresponding to the number of brands
selling the drug of interest) has a large variance and covers some relatively
high values.
This suggests that the impact of how frequently the drug is sold varies greatly
across different ICD-10 categories of disease.
We can also examine the direct effect from adding a single generic competitor drug.
\begin{figure}[H] \begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/mu_posterior.png} \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-generic}
\caption{Posterior Parameter Estimates: Mu} \caption{}
\label{fig:mu_posterior} \label{fig:pred_dist_diff_generic}
\end{figure} \end{figure}
% Sigma[2] suggests there is a high variance in the impact that the number of drugs on the market has. Figure \ref{fig:pred_dist_dif_generic2} shows how this varies across disease categories
\begin{figure}[H] \begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/sigma_posterior.png} \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-generic-group}
\caption{Posterior Hyperparameter Estimates: Sigma} \caption{}
\label{fig:sigma_posterior} \label{fig:pred_dist_dif_generic2}
\end{figure} \end{figure}
Due to the deficiencies in the data and model, this is the limit of the
analysis I will perform at this time.
\end{document} \end{document}

@ -0,0 +1,384 @@
\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}
\begin{document}
% Begin by talking about goal, what does it mean? This might need some work prior to give more background.
As I am trying to separate strategic concerns
(the effect of a marginal treatment methodology)
and an operational concern
(the effect of a delay in closing enrollment),
we need to look at what confounds these effects and how we might measure them.
The primary effects one might expect to see are that
\begin{enumerate}
\item Adding more drugs to the market will make it harder to
finish a trial as it is
more likely to be terminated due to concerns about profitability.
\item Adding more drugs will make it harder to recruit, slowing enrollment.
\item Enrollment challenges increase the likelihood that a trial will
terminate.
% Mentioned below
% \item A large population/market will tends to have more drugs to treat it
% because it is more profitable.
% \item A large population/market will make it easier to recruit,
% reducing the likelihood of a termination due to enrollment failure.
\end{enumerate}
There are a few fundamental issues that arise when trying to estimate
these effects.
The first is that the severity of the disease and the size of the population
with that disease affect the ease of enrolling participants.
For example, a large population may make it easier to find enough participants
to achieve the required statistical discrimination between
control and treatment.
Second, for some diseases there exists an endogenous dynamic
between the treatments available for a disease and the
market size/population with that disease.
\authorcite{cerda_EndogenousInnovations_2007} proposes two mechanisms
that link the drugs on the market and market size.
The inverse is that for many chronic diseases with high mortality rates,
more drugs cause better survivability, increasing the size of those markets.
The third major confound is that the drugs on the market affect enrollment.
If there is a treatment already on the market, patients or their doctors
may be less inclined to participate in the trial, even if the current treatment
has severe downsides.
There are additional problems.
One is that the disease being treated affects the
safety and efficacy standards that the drug will be held to.
For example, if a particular cancer is very deadly and does not respond well
to current treatments, Phase I trials will enroll patients with that cancer,
as opposed to the standard of enrolling healthy volunteers
\cite{commissioner_DrugDevelopment_2020} to establish safe dosages.
The trial is more likely to be terminated early if the drug is unsafe or has no
discernible effect; termination therefore depends in part on a compound-disease
interaction.
Another challenge comes from the interaction between duration and termination:
if a trial terminates before closing enrollment for reasons other
than enrollment, then enrollment will still be low.
On the other hand, if enrollment is low, the trial might terminate.
These outcomes are indistinguishable in the data provided by the final
\url{ClinicalTrials.gov} dataset.
Finally, while conducting a trial, the safety and efficacy of a drug are driven by
fundamental pharmacokinetic properties of the compounds.
These are only imperfectly measured both prior to and during any given trial.
Previously measured safety and efficacy inform the decision to start the trial
in the first place, while currently observed safety and efficacy results
help the sponsor judge whether or not to continue the trial.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Clinical Trials Data Sources}
%% Describe data here
Since September 27, 2007, those who conduct clinical trials of FDA-regulated
drugs or devices on human subjects must register
their trial at \url{ClinicalTrials.gov}
(\cite{noauthor_fdaaa_nodate}).
This involves submitting information on the expected enrollment and duration of
trials, drugs or devices that will be used, treatment protocols and study arms,
as well as contact information for the trial sponsor and treatment sites.
When starting a new trial, the required information must be submitted
``\dots not later than 21 calendar days after enrolling the first human subject\dots''.
After the initial submission, the data is briefly reviewed for quality and
then the trial record is published and the trial is assigned a
National Clinical Trial (NCT) identifier
\cite{noauthor_fdaaa_nodate}.
Each trial's record is updated periodically, including a final update that must occur
within a year of completing the primary objective, although exceptions are
available for trials related to drug approvals or for trials with secondary
objectives that require further observation\footnote{This rule came into effect in 2017}
\cite{noauthor_fdaaa_nodate}.
Other than the requirements for the first and last submissions, all other
updates occur at the discretion of the trial sponsor.
Because the ClinicalTrials.gov website serves as a central point of information
on which trials are active or recruiting for a given condition or drug,
most trials are updated multiple times during their progression.
There are two primary ways to access data about clinical trials.
The first is to search individual trials on ClinicalTrials.gov with a web browser.
This web portal shows the current information about the trial and provides
access to snapshots of previously submitted information.
Together, these features fulfill most of the needs of those seeking
to join a clinical trial.
For this project I've been able to scrape these historical records to establish
snapshots of the records provided.
%include screenshots?
The second way to access the data is through a normalized database set up by
the
\href{https://aact.ctti-clinicaltrials.org/}{Clinical Trials Transformation Initiative}
called AACT. %TODO: Get CITATION
The AACT database is available as a PostgreSQL database dump or set of
flat-files.
These dumps match a near-current version of the ClinicalTrials.gov database.
This format is amenable to large scale analysis, but does not contain
information about the past state of trials.
I combined these two sources, using the AACT dataset to select
trials of interest and then scraping \url{ClinicalTrials.gov} to get
a timeline of each trial.
%%%%%%%%%%%%%%%%%%%%%%%% Model Outline
The way I use this data is to predict the final status of the trial
from the snapshots that were taken, in effect asking:
``how does the probability of a termination change from the current state
of the trial if X changes?''
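This question can be written as a contrast between two interventional
probabilities.
The notation below is an illustrative sketch introduced only for this
passage, not the estimating equation itself:
let $Y = 1$ indicate that the trial is eventually terminated,
let $S = s$ be the state of the trial recorded in a given snapshot,
and let $x$ and $x'$ be two values of the variable of interest $X$,
where $\mathrm{do}(\cdot)$ denotes setting a variable by intervention.
\begin{equation}
\Delta(x, x'; s) =
\Pr\bigl(Y = 1 \mid \mathrm{do}(X = x'), S = s\bigr)
- \Pr\bigl(Y = 1 \mid \mathrm{do}(X = x), S = s\bigr)
\end{equation}
A positive $\Delta$ means that moving $X$ from $x$ to $x'$ raises the
probability of termination for a trial in state $s$.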
%% Return to causal identification
\subsection{Causal Identification}
Because running experiments on companies running clinical trials is not going
to happen anytime soon, causal identification depends on using a
structural causal model.
Because the data generating process for the clinical trials records is rather
straightforward, this is an ideal place to use
\authorcite{pearl_causality_2000}
Do-Calculus.
This process involves describing the data generating process in the form of
a directed acyclic graph, where the nodes represent different variables
within the causal model and the directed edges (arrows) represent
assumptions about which variables influence the other variables.
There are a few algorithms that then tell the researcher which of the
relationships will be confounded, which ones can be statistically estimated,
and provide some hypotheses that can be tested to ensure the model is
reasonably correct.
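As a generic illustration of what these algorithms deliver
(the symbols $X$, $Y$, and $Z$ here are placeholders, not nodes of my graph):
if a set of observed variables $Z$ satisfies the backdoor criterion
relative to a treatment $X$ and an outcome $Y$,
then the effect of intervening on $X$ is identified from observational data by
\begin{equation}
\Pr\bigl(Y = y \mid \mathrm{do}(X = x)\bigr)
= \sum_{z} \Pr\bigl(Y = y \mid X = x, Z = z\bigr) \Pr\bigl(Z = z\bigr),
\end{equation}
with the sum replaced by an integral when $Z$ is continuous.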
In \cref{Fig:CausalModel} I diagram the directed acyclic graph that describes
my proposed data generating process.
It revolves around the decisions made by the study sponsor,
who must decide whether to let a trial run to completion
or terminate the trial early.
While receiving updates regarding the status of the trial, they ask questions
such as:
\begin{itemize}
\item Do I need to terminate the trial due to safety incidents?
\item Does it appear that the drug is effective enough to achieve our
goals, justifying continuing the trial?
\item Are we recruiting enough participants to achieve the statistical
results we need in the budget we have?
\item Do the current market conditions and expectations about returns on
investment justify the expenditures we are making?
\end{itemize}
When appropriate issues arise, the study sponsor terminates the trial, otherwise
it continues to completion.
\begin{figure}[H] %use [H] to fix the figure here.
\frame{
\scalebox{0.65}{
\tikzfig{../assets/tikzit/CausalGraph2}
}
}
\todo{check if this is the correct graph}
\caption{Graphical Causal Model}
% \small{Crimson boxes are the variables of interest,
% white boxes are unobserved, while the gray boxes will be controlled for.}
\label{Fig:CausalModel}
\end{figure}
% Constructing the model more explicitly
% - quickly describe each node and line.
\todo{I think I need to blend the data section in before this, to give some overall information on data.}
\todo{I may need to add some information on snapshots so that this makes sense.}
A quick summary of the nodes of the DAG, the exact representation in the data, and their impact:
\begin{itemize}
\item Main Interests (Crimson Boxes)
\begin{enumerate}
\item \texttt{Will Terminate?}:
Whether the final status of the trial was \textit{terminated}
or \textit{completed}.
This comes from the AACT dataset.
\item \texttt{Enrollment Status}:
This describes the current enrollment status of the snapshot, e.g.
\texttt{Recruiting},
\texttt{Enrolling by invitation only},
or
\texttt{Active, not recruiting}.
\item \texttt{Market Measures}:
Various measures of the number of alternate drugs on the market.
These are either the number of other drugs with the same active ingredient as the trial drug
(both generics and originators)
or the number of drugs considered alternatives in various formularies published by the United States Pharmacopeia.
\end{enumerate}
\item Observed Confounders (Gray Boxes)
\begin{enumerate}
\item \texttt{Condition}:
The underlying condition, classified by ICD-10 group.
This impacts every other aspect of the model and is pulled from
the AACT dataset.
\item \texttt{Population (market size)}:
Multiple measures of the impact of the disease.
These are measured by the DALY cost of the disease, and are
separated by the impact on countries with
High, High-Medium, Medium, Medium-Low, and Low
development scores.
This data comes from the Institute for Health Metrics' Global Burden of Disease study.
\item \texttt{Elapsed Duration}:
A normalized measure of the time elapsed in the trial.
Comes from the original estimate of the trial's primary completion date and the registered start date.
I take the difference in days between these, and get the percentage of that time that has elapsed.
This calculation is based on data from the snapshots and the
AACT final results.
\item \texttt{Decision to Proceed with Phase III}:
If the compound development has progressed to Phase III.
This is included in the analysis by only including
Phase III trials registered in the AACT dataset.
\end{enumerate}
\item Unobserved Confounders (White Boxes)
\begin{enumerate}
\item \texttt{Fundamental Efficacy and Safety}:
The underlying safety and efficacy of the compound.
Cannot be observed, only estimated through scientific study.
\item \texttt{Previously observed Efficacy and Safety}:
The information gathered in previous studies.
This is not available in my dataset because I don't
have links to prior studies.
\item \texttt{Currently observed Efficacy and Safety}:
The information gathered during this study.
This is only partially available, and so is
treated as unavailable.
After a study is over, the investigators
often publish information about adverse events, but only
those that meet a certain threshold.
As this information doesn't appear to be provided to
participants, we don't consider it.
\end{enumerate}
\end{itemize}
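The \texttt{Elapsed Duration} measure described above can be sketched as a formula.
The symbols are illustrative, and the sketch assumes that elapsed time is evaluated at the date of each snapshot:
$t_{\mathrm{snap}}$ is the date of the snapshot,
$t_{\mathrm{start}}$ is the registered start date, and
$t_{\mathrm{pcd}}$ is the originally estimated primary completion date,
all measured in days.
\begin{equation}
d = \frac{t_{\mathrm{snap}} - t_{\mathrm{start}}}{t_{\mathrm{pcd}} - t_{\mathrm{start}}}
\end{equation}
Under this definition, and if the measure is not capped,
$d = 0.5$ corresponds to a snapshot taken halfway through the originally planned duration,
while $d > 1$ indicates a trial that has run past its original estimate.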
%
\begin{itemize}
\item Relationships of interest
\begin{enumerate}
\item \texttt{Enrollment Status} $\rightarrow$ \texttt{Will Terminate?}:
This is the primary effect of interest.
\item \texttt{Market Measures} $\rightarrow$ \texttt{Will Terminate?}:
This is the secondary effect of interest.
\end{enumerate}
\item Confounding Pathways
\begin{enumerate}
\item
\texttt{Condition}:
Affects every other node.
Part of the Adjustment Set.
\item Backdoor Pathway
between \texttt{Will Terminate?} and
\texttt{Enrollment Status} through safety and efficacy.
The concern is that since previously learned information
and current information are driven by the same underlying
physical reality, the enrollment process and
termination decisions may be correlated.
Controlling for the decision to proceed with the trial is the
best adjustment available to block this confounding pathway.
Below I describe the exact pathways.
\begin{enumerate}
\item
\texttt{Fundamental Efficacy and Safety}
$\rightarrow$
\texttt{Currently Observed Efficacy and Safety}:
This relationship represents the measurements of
safety and efficacy in the current trial.
\item
\texttt{Currently Observed Efficacy and Safety}
$\rightarrow$
\texttt{Will Terminate?}:
This is how the measurements of safety and efficacy in the
current trial affect the probability of termination.
% typically, evidence of a lack safety or efficacy is
% enought to terminate the trial.
\item \texttt{Fundamental Efficacy and Safety}
$\rightarrow$
\texttt{Previously Observed Efficacy and Safety}:
This relationship represents the measurements of
safety and efficacy in work prior to the current trial.
\item
\texttt{Previously Observed Efficacy and Safety}
$\rightarrow$
\texttt{Decision to proceed with Phase III}:
Previously observed data is essential to the FDA's
decision to allow a phase III trial.
\end{enumerate}
\item
Backdoor Pathway from \texttt{Market Measures}
to \texttt{Enrollment Status}
through \texttt{Population}.
The concern with this pathway is that the rate of enrollment, and
thus the enrollment status, is affected by the Population with
the disease.
Additionally, there is a concern that the number of competitors
is driven by the total market size.
Thus adding Population to the adjustment set is necessary.
\begin{enumerate}
\item
\texttt{Population}
$\rightarrow$
\texttt{Enrollment Status}:
This is fairly straightforward.
How easy it is to enroll participants depends in part
on how many people have the disease.
\item
\texttt{Population}
$\rightarrow$
\texttt{Market Measures}:
This assumes that the population effect flows only one
direction, i.e. that a large population size increases
the likelihood of a large number of drugs.
%TODO: Think about this one a bit because it does mess
% with identification, particularly of market effects.
% these two are jointly determined per cerda 2007.
% If I can't justify separating them, then I'll need to
% merge population (market size) and market measures (drugs on market).
\end{enumerate}
\item
\texttt{Market Measures}
$\rightarrow$
\texttt{Enrollment Status}:
This confounds the estimation of the effect of
\texttt{Enrollment Status} on \texttt{Will Terminate?}, and
so \texttt{Market Measures} is part of the adjustment set.
\item
\texttt{Market Measures}
$\rightarrow$
\texttt{Decision to proceed with Phase III}:
The alternative treatments on the market will affect a sponsor's
decision to move forward with a Phase III trial.
This is controlled for by only working with trials that
successfully begin recruitment for a Phase III Trial.
\item
\texttt{Elapsed Duration}
$\rightarrow$
\texttt{Will Terminate?}:
The amount of time passed helps drive the decision to continue
or terminate.
\item
\texttt{Enrollment Status}
$\leftrightarrow$
\texttt{Elapsed Duration}:
% This is jointly determined. and the weakest part of the causal identification without an accurate model of enrollment.
This is one of the weakest parts of the causal inference.
Without a well defined model of enrollment, we can't separate
the interaction between the enrollment status and the elapsed
duration.
For example, if enrollment is running slower than expected,
the trial may be terminated due to concerns that it will not
achieve the primary objectives or that costs will exceed
the budget allocated to the project.
\item
\texttt{Decision to Proceed with Phase III}
$\rightarrow$
\texttt{Will Terminate?}:
%obviously required. Maybe remove from listing and graph?
This effect is fairly straightforward, in that
there is no possibility of a termination or completion
if the trial does not start.
This is here to block a backdoor pathway between
\texttt{Will Terminate?} and the enrollment status
through \texttt{Previously observed Efficacy and Safety}.
\end{enumerate}
\end{itemize}
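Collecting the adjustments named in the list above, the primary effect can be sketched as a backdoor adjustment.
This is an illustration that assumes the listed variables satisfy the backdoor criterion, and the symbols are shorthand introduced only here:
$E$ for \texttt{Enrollment Status}, $Y$ for \texttt{Will Terminate?},
$C$ for \texttt{Condition}, $P$ for \texttt{Population}, and $M$ for \texttt{Market Measures},
with every probability taken over Phase III trials only.
\begin{equation}
\Pr\bigl(Y = 1 \mid \mathrm{do}(E = e)\bigr)
= \sum_{c, p, m}
\Pr\bigl(Y = 1 \mid E = e, C = c, P = p, M = m\bigr)
\Pr\bigl(C = c, P = p, M = m\bigr)
\end{equation}
Sums become integrals for continuous measures.
\texttt{Elapsed Duration} is left out of this sketch because its relationship with
\texttt{Enrollment Status} is not yet separated, as noted above.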
\end{document}

@ -0,0 +1,318 @@
\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}
\begin{document}
In 1938 President Franklin D. Roosevelt signed the Food, Drug, and Cosmetic Act,
granting the Food and Drug Administration (FDA) authority to require
pre-market approval of pharmaceuticals
\cite{commissioner_MilestonesUS_2023}.
As of Sept 2022 \todo{Check Date} they have approved 6,602 currently-marketed
compounds with Structured Product Labels (SPLs)
and 10,983 previously-marketed SPLs
\cite{commissioner_NSDE_2024}.
%from nsde table. Get number of unique application_nubmers_or_citations with most recent end date as null.
In 1999, they began requiring that drug developers register and
publish clinical trials on \url{https://clinicaltrials.gov}.
This provides a public mechanism where clinical trial sponsors are
responsible for explaining what they are trying to achieve and how it will be
measured, as well as giving the public the ability to search and find trials
that they might enroll in.
Multiple derived datasets such as the Cortellis Investigational Drugs dataset
or the AACT dataset from the Clinical Trials Transformation Initiative
integrate these data.
This brings up a question:
Can we use this public data on clinical trials to identify what affects the
success or failure of trials?
In this work, I use updates to records on
\url{https://ClinicalTrials.gov}
to do exactly that: disentangle how participant enrollment
and competing drugs on the market affect the success or failure of
clinical trials.
%Describe how clinical trials fit into the drug development landscape and how they proceed
Clinical trials are a required part of drug development.
Not only does the FDA require that a series of clinical trials demonstrate sufficient safety and efficacy of
a novel pharmaceutical compound or device, producers of derivative medicines may be required to ensure that
their generic small molecule compound -- such as ibuprofen or levothyroxine -- matches the
performance of the originator drug if delivery or dosage is changed.
For large molecule generics (termed biosimilars) such as Adalimumab
(Brand name Humira, with biosimilars Abrilada, Amjevita, Cyltezo, Hadlima, Hulio,
Hyrimoz, Idacio, Simlandi, Yuflyma, and Yusimry),
the biosimilars are required to prove they have similar efficacy and safety to the
reference drug.
When registering these clinical trials
% discuss how these are registered and what data is published.
% Include image and discuss stages
% Discuss challenges faced
% Introduce my work
In the world of drug development, these trials are classified into different
phases of development.
\cite{FDADrugApprovalProcess_2022}
provides an overview of this process, while
\cite{commissioner_DrugDevelopment_2020}
describes the actual details.
Pre-clinical studies primarily establish toxicity and potential dosing levels
\cite{commissioner_DrugDevelopment_2020}.
Phase I trials are the first attempt to evaluate safety and efficacy in humans.
Participants are typically healthy individuals, and investigators measure how the drug
affects healthy bodies, identify potential side effects, and adjust dosing levels.
Sample sizes are often less than 100 participants
\cite{commissioner_DrugDevelopment_2020}.
Phase II trials typically involve a few hundred participants and are where
investigators dial in dosing, research methods, and safety
\cite{commissioner_DrugDevelopment_2020}.
A Phase III trial is the final trial before approval by the FDA, and is where
the investigator must demonstrate safety and efficacy with a large number of
participants, usually on the order of hundreds or thousands
\cite{commissioner_DrugDevelopment_2020}.
Occasionally, a trial will be a multiphase trial, covering aspects of either
Phases I and II or Phases II and III.
After a successful Phase III trial, the sponsor will decide whether or not
to submit an application for approval from the FDA.
Before filing this application, the developer must have completed
``two large, controlled clinical trials''
\cite{commissioner_DrugDevelopment_2020}.
Phase IV trials are used after the drug has received marketing approval to
validate safety and efficacy in the general populace.
Throughout this whole process, the FDA is available to assist in decision-making
regarding topics such as study design, document review, and whether or not
they should terminate the trial.
The FDA also reserves the right to place a hold on the clinical trial for
safety or other operational concerns, although this is rare
\cite{commissioner_DrugDevelopment_2020}.
In the economics literature, most of the focus has been on evaluating how
drug candidates transition between different phases and their probability
of final approval.
% Lead into lit review
% Abrantes-Metz, Adams, Metz (2004)
\cite{abrantes-metz_pharmaceutical_2004}
described the relationship between
various drug characteristics and how the drug progressed through clinical trials.
% This descriptive estimate was notable for using a
% mixed state proportional hazard model and estimating the impact of
% observed characteristics in each of the three phases.
They found that as Phase I and II trials last longer,
the rate of failure increases.
In contrast, Phase III trials generally have a higher rate of
success than failure after 91 months.
This may be because the purposes of Phases I and II differ
from the purpose of Phase III.
Continuing on this theme,
%DiMasi FeldmanSeckler Wilson 2009
\cite{dimasi_TrendsRisks_2010} examine the completion rate of clinical drug
development and find that for the 50 largest drug producers,
approximately 19\% of their drugs under development between 1993 and 2004
successfully moved from Phase I to receiving a New Drug Application (NDA)
or Biologics License Application (BLA).
They note a couple of changes in how drugs are developed over the years they
study, most notably that
drugs began to fail earlier in their development cycle in the
latter half of the time they studied.
They note that this may reduce the cost of new drugs by eliminating late
and costly failures in the development pipeline.
Earlier work by
\authorcite{dimasi_ValueImproving_2002}
used data on 68 investigational drugs from 10 firms to simulate how reducing
time in development reduces the costs of developing drugs.
He estimates that reducing Phase III of clinical trials by one year would
reduce total costs by about 8.9\% and that moving 5\% of clinical trial failures
from Phase III to Phase II would reduce out-of-pocket costs by 5.6\%.
Like much of the work in this field, the focus of the work by
\citeauthor{dimasi_ValueImproving_2002}
and
\citeauthor{dimasi_TrendsRisks_2010}
tends to be on the drug development pipeline, i.e. the progression between
phases and towards marketing approval.
A key contribution to this drug development literature is the work by
\authorcite{khmelnitskaya_CompetitionAttrition_2021}
on a causal identification strategy
to disentangle strategic exits from exits due to clinical failures
in the drug development pipeline.
She found that overall 8.4\% of all pipeline exits are due to strategic
terminations and that the rate of new drug production would be about 23\%
higher if those strategic terminations were eliminated.
The work that is closest to mine is the work by
\authorcite{hwang_FailureInvestigational_2016}
who investigated causes for which late stage (Phase III)
clinical trials fail -- with a focus on trials in the USA,
Europe, Japan, Canada, and Australia.
They identified 640 novel therapies and then studied each therapy's
development history, as outlined in commercial datasets.
They found that for late stage trials that did not go on to receive approval,
57\% failed on efficacy grounds, 17\% failed on safety grounds, and 22\% failed
on commercial or other grounds.
% Begin Discussing what I do. Then introduce
Unlike the majority of the literature, I focus on the progress of
individual clinical trials, not on the drug development pipeline.
In both
\authorcite{khmelnitskaya_CompetitionAttrition_2021}
and
\authorcite{hwang_FailureInvestigational_2016}
the authors describe failures due to safety, efficacy, or strategic concerns.
There is another category of concerns that arise for individual clinical trials,
that of operational failures.
Operational failures can arise when a trial struggles to recruit participants,
the principal investigator or other key member leaves for another opportunity,
or other studies prove that the trial requires a protocol change.
% In a personal review of 199 randomly selected clinical trials from the AACT
% database, the
% \begin{table}
% \caption{}\label{tab:}
% \begin{center}
% \begin{tabular}[c]{|l|l|}
% \hline
% Reason & Percentage Mentioned \\
% \hline
% Safety or Efficacy & 14.5\% \\
% Funding Problems & 9.1\% \\
% Enrollment Issues & 31\% \\
% \hline
% \end{tabular}
% \end{center}
% \end{table}
This paper proposes the first model to separate the causal effects of
market conditions (a strategic concern) from the effects of
participant enrollment (an operational concern) on Phase III Clinical trials.
This will allow me to answer the questions:
\begin{itemize}
\item What is the marginal effect on trial completion of an additional
generic drug on the market?
\item What is the marginal effect on trial completion of a delay in
closing enrollment?
\end{itemize}
To understand how I do this, we'll cover some background information on
clinical trials in section \ref{SEC:ClinicalTrials},
explain the data in section \ref{SEC:DataSources},
and then examine causal identification and the econometric model in section
\ref{SEC:CausalIdentificationAndModel}.
Finally I'll review the results and conclusion in sections
\ref{SEC:Results}
and
\ref{SEC:Conclusion}
respectively.
% \subsection{Market incentives and drug development}
% %%%%%%%%% What do we know about drug development incentives?
%
% \cite{dranove_DoesConsumer_2022} use the implementation of Medicare part D
% to examine whether the production of novel or follow up drugs increases during
% the following 15 years.
% They find that when Medicare part D was implemented -- increasing senior
% citizens' ability to pay for drugs -- there was a (delayed) increase
% in drug development, with effects concentrated among compounds that were least
% innovative according to their classification of innovations.
% They suggest that this is due to financial risk management, as novel
% pharmaceuticals have a higher probability of failure compared to the less novel
% follow up development.
% This is what leads risk-adverse companies to prefer follow up development.
%
%
% % Acemoglu and Linn
% % - Market size in innovation
% % - Exogenous demographic trends has a large impact on the entry of non-generic drugs and new molecular entitites.
% On the side of market analysis,
% \citeauthor{acemoglu_market_2004}
% (\citeyear{acemoglu_market_2004})
% used exogenous deomographics changes to show that the
% entry of novel compounds is highly driven by the underlying aged population.
% They estimate that a 1\% increase in applicable demographics increase the
% entry of new drugs by 6\%, mostly concentrated among generics.
% Among non-generics, a 1\% increase in potential market size
% (as measured by demographic groups) leads to a 4\% increase in novel therapies.
%
% % Gupta
% % - Inperfect intellectual property rights in the pharmaceutical industry
% \cite{gupta_OneProduct_2020} discovered that uncertainty around which patents
% might apply to a novel drug causes a delay in the entry of generics after
% the primary patent has expired.
% She found that this delay in delivery is around 3 years.
%
% % Agarwal and Gaule 2022
% % - Retrospective on impact from COVID-19 pandemic
% % Not in this version
%
% \subsection{Understanding Failures in Drug Development}
%
% % DISCUSS: Different types of failures
% There are myriad of reasons that a drug candidate may not make it to market,
% regardless of it's novelty or known safety.
% In this work, I focus on the failure of individual clinical trials, but the
% categories of failure apply to the individual trials as well as the entire
% drug development pipeline.
% They generally fall into one of the following categories:
% \begin{itemize}
% \item Scientific Failure: When there are issues regarding
% safety and efficacy that must be addressed.
% The preeminient question is:
% ``Will the drug work for patients?''
% %E.Khm, Gupta, etc.
% \item Strategic Failure: When the sponsors stop development because of
% profitability
% %Whether or not the drug will be profitiable, or align with
% %the drug developer's future Research \& Development directions i.e.
% ``Will producing the drug be beneficial to the
% company in the long term?''
% %E.Khm, Gupta, GLP-1s, etc.
% \item Operational concerns are answers to:
% %Whether or not the developer can successfully conduct
% %operations to meet scientific or strategic goals, i.e.
% ``What has prevented the the company from being able to
% finance, develop, produce, and market the drug?''
% \end{itemize}
% It is likely that a drug fails to complete the development cycle due to some
% combination of these factors.
%
%
% %USE MetaBio/CalBio GLP-1 story to illustrate these different factors.
% \cite{flier_DrugDevelopment_2024} documents the case of MetaBio, a company
% he was involved in founding that was in the first stages of
% developing a GLP-1 based drug for diabetes or obesity before being shut down
% in .
% MetaBio was a wholly owned subsidiary of CalBio, a metabolic drug development
% firm, that received a \$30 million -- 5 year investment from Pfizer to
% pursue development of GLP-1 based therapies.
% At the time it was shut down, it faced a few challenges:
% \begin{itemize}
% \item The compound had a short half life and they were seeking methods to
% improve its effectiveness; a scientific failure.
% \item Pfizer imposed a requirement that it be delivered through a route
% other than injection (the known delivery mechanism); a strategic failure.
% \item When Pfizer pulled the plug, CalBio closed MetaBio because they
% could not find other funding sources; an operational failure.
% \end{itemize}
%
% The author states in his conclusion:
% \begin{displayquote}
% Despite every possibility of success,
% MetaBio went down because there were mistaken ideas about what was
% possible and what was not in the realm of metabolic therapeutics, and
% because proper corporate structure and adequate capital are always
% issues when attempting to survive predictable setbacks.
% \end{displayquote}
%
% From this we see that there was a cascade of issues leading to the failure to
% develop this novel drug.
%
%
% % I don't think I need to include modelling enrollment here.
% % If it is applicable, it can show up in those sections later.
%
%
\end{document}

@ -0,0 +1,93 @@
\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}
\begin{document}
% Clinical Trials Background Outline
% - ClinicalTrials.gov
% - Clincial trial progression
% -
% -
% -
% -
% -
% -
% -
To understand how my administrative clinical trial data is obtained
and what it can be used for,
let's take a look at how trial investigators record data on
\url{ClinicalTrials.gov}.
Figure \ref{Fig:Stages} illustrates the process I describe below.
During the Pre-Trial period the trial investigators will design the trial,
choose primary and secondary objectives,
and decide on how many participants they need to enroll.
Once they have decided on these details, they post the trial to \url{ClinicalTrials.gov}
and decide on a date to begin enrolling trial participants.
If the investigators decide to not continue with the trial before enrolling any participants,
the trial is marked as ``Withdrawn''.
On the other hand, if they begin enrolling participants, there are two ways to do so.
The first is to enter a general ``Recruiting'' state, in which patients may apply to enroll.
The second is to enter an ``Enrolling by invitation'' state.
After a trial has enrolled its participants, it will typically move to an
``Active, not recruiting'' state to inform potential participants that it is
no longer recruiting.
Finally, when the investigators have obtained enough data to achieve their primary
objective, the clinical trial will be closed, and marked as ``Completed'' in
\url{ClinicalTrials.gov}.
If the trial is closed before achieving the primary objective, the trial is
marked as ``Terminated'' on
\url{ClinicalTrials.gov}.
\begin{figure}%[H] %use [H] to fix the figure here.
\includegraphics[width=\textwidth]{../assets/img/ClinicalTrialStagesAndStatuses}
\par \small
Diamonds represent decision points,
squares represent states of the clinical trial, and rhombuses represent data obtained by the trial.
\caption[Clinical Trial Stages and Progression]{Clinical Trial Stages and Progression}
\label{Fig:Stages}
\end{figure}
Note the information we obtain about the trial from the final status:
``Withdrawn'', ``Terminated'', or ``Completed''.
Although \cite{khm} describes a clinical failure due to safety or efficacy as a
\textit{scientific} failure, it is better described as a compound failure.
Discovering that a compound doesn't work as hoped is not a failure but the whole
purpose of the clinical trials process.
On the other hand, when a trial terminates early for reasons
other than safety or efficacy concerns, the trial operator does not learn
whether the drug is effective or safe.
This is a true failure, because the trial ends without answering the question it was designed to answer.
Unfortunately, although termination records typically include a
stated reason for the termination, that description does not necessarily
list every contributing reason, and some trials have no stated reason at all.
As a trial goes through the different stages of recruitment, the investigators
update the records on ClinicalTrials.gov.
Even though there are only a few times that investigators are required
to update this information, it tends to be updated somewhat regularly as it is
a way to communicate with potential enrollees.
When a trial is first posted, it tends to include information
such as planned enrollment,
planned end dates,
the sites at which it is being conducted,
the diseases that it is investigating,
the drugs or other treatments that will be used,
the experimental arms that will be used,
and who is sponsoring the trial.
As enrollment is opened and closed and sites are added or removed,
investigators will update the status and information
to help doctors and potential participants understand whether they should apply.
% -
% -
% -
% -
% -
% -
\end{document}

@ -0,0 +1,54 @@
--get a list of the most recent activations that exist for a given application.
create temp table nsde_activations as
select
application_number_or_citation,
count(distinct package_ndc) as package_count,
max(marketing_start_date) as most_recent_start,
max(marketing_end_date) as most_recent_end,
max(inactivation_date) as most_recent_inactivation,
max(reactivation_date) as most_recent_reactivation
from spl.nsde
group by application_number_or_citation
;
select count(*) from nsde_activations
where most_recent_end is null
;
/*
count
-----
6602
*/
select count(*) from nsde_activations
where most_recent_end is NOT null
;
/*
count
-----
10983
*/
/*
So, the current number of marketed compounds is how many NDA or ANDA (ANADA?) compounds there are.
*/
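-- A possible single-pass alternative to the two counts above (a sketch, not
-- part of the original analysis): Postgres aggregate FILTER clauses return
-- both counts from one scan of nsde_activations. The aliases no_end_date and
-- has_end_date are illustrative names, not columns from the source data.
select
count(*) filter (where most_recent_end is null) as no_end_date,
count(*) filter (where most_recent_end is not null) as has_end_date
from nsde_activations
;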
-- get count of drugs that you can select by first 3 letters
select
left(application_number_or_citation, 3) as first_3,
count(*) as row_count
from nsde_activations
group by first_3
;
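-- Restrict the same count to specific prefixes.
-- NOTE: the prefix list was left empty in the original and must be filled in
-- before this runs; Postgres rejects an empty IN () list.
-- The WHERE clause repeats the expression because a select alias (first_3)
-- cannot be referenced in WHERE.
select
left(application_number_or_citation, 3) as first_3,
count(*) as row_count
from nsde_activations
where left(application_number_or_citation, 3) in ()
group by first_3
;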

Binary file not shown.

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

@ -1,573 +0,0 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Beamer Presentation
% LaTeX Template
% Version 1.0 (10/11/12)
%
% This template has been downloaded from:
% http://www.LaTeXTemplates.com
%
% License:
% CC BY-NC-SA 3.0 (http://creativecommons.org/licenses/by-nc-sa/3.0/)
%
% Changed theme to WSU by William King
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%----------------------------------------------------------------------------------------
% PACKAGES AND THEMES
%----------------------------------------------------------------------------------------
\documentclass[xcolor=dvipsnames,aspectratio=169]{beamer}
%Import Preamble bits
\input{../assets/preambles/FormattingPreamble.tex}
\input{../assets/preambles/TikzitPreamble.tex}
\input{../assets/preambles/MathPreamble.tex}
\input{../assets/preambles/BibPreamble.tex}
\input{../assets/preambles/GeneralPreamble.tex}
%----------------------------------------------------------------------------------------
% TITLE PAGE
%----------------------------------------------------------------------------------------
\title[Clinical Trials]{The Effects of Market Conditions on Recruitment and Completion of Clinical Trials}
\author{Will King} % Your name
\institute[WSU] % Your institution as it will appear on the bottom of every slide, may be shorthand to save space
{
Washington State University \\ % Your institution for the title page
\medskip
\textit{william.f.king@wsu.edu} % Your email address
}
\date{\today} % Date, can be changed to a custom date
\begin{document}
\begin{frame}
\titlepage % Print the title page as the first slide
\end{frame}
%----------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Clinical Trials} % Table of contents slide, comment this out to remove it
% - Intro and hook (Clinical Trials are key part of pharmacological pipeline)
Pharmaceuticals are a frequently discussed aspect of health care cost management.
Their development is dictated by scientific and regulatory hurdles
including passing clinical trials
(\cite{noauthor_fda_nodate}),
while their market is characterized by strategic competition and ambiguous
patent protection
(\cite{van_der_gronde_addressing_2017}).
\vspace{12pt}
This research investigates the pathways by which market conditions
affect clinical trial completion.
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{This research}
\textbf{Questions:}
\begin{enumerate}
\item Does the existence of alternative drugs on the market make it
harder for clinical trials to complete successfully?
\item How much of this occurs due to increased recruitment difficulty?
\end{enumerate}
\end{frame}
%--------------------------------
\begin{frame}
\frametitle{Thanks} % Table of contents slide, comment this out to remove it
Thanks to Chris Adams and Rebecca Sachs of the Congressional Budget Office.
\end{frame}
%--------------------------------
\begin{frame}[allowframebreaks] %Allow frame breaks
\frametitle{Overview} % Table of contents slide, comment this out to remove it
\tableofcontents
% - Intro and hook
% - Literature review
% - Causal Identification
% - Data
% - Econometric model
% - Results
% - Improvements
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Lit Review %%%%%%%%%%%%%%%%%%%%%%%%
\section{Lit Review}
% First slide:
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Literature Highlights}
\begin{itemize}
\item \cite{van_der_gronde_addressing_2017}:
High level synthesis of overall discussion regarding drug costs.
Both academic and non-academic sources.
\item \cite{hwang_failure_2016}:
Answered the question ``Why do late-stage (phase III) trials fail?''
Found that efficacy, safety, and competition reasons accounted for
57\%, 17\%, and 22\% respectively.
\item \cite{abrantes-metz_pharmaceutical_2004}:
Described how drugs progress through the 3 phases of clinical trials
and correlations between various trial characteristics and the
clinical trial failures.
\item \cite{khmelnitskaya_competition_2021}:
Modeled clinical trial lifecycle of drugs, found method to separate
scientific from competitive reasons for failure to progress to the
next phase.
% \item \cite{}:
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{This research, in context}
In contrast to previous work looking at multiple phases of trials,
I seek to figure out what causes individual trials to fail.
\vspace{12pt}
Instead of focusing on the drug development pipeline, I attempt to
investigate the population of drug-based, phase III trials.
\end{frame}
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Why this approach?} % Table of contents slide, comment this out to remove it
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/methodology_trial.png}
\caption{``If you think THAT'S unethical, you should see the stuff we approved via our Placebo IRB.''
- \url{https://xkcd.com/2726}
}
\label{FIG:xkcd2726}
\end{figure}
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Causal Identification / DGP%%%%%%%%%%%%%%%%%%%%%%%%
\section{Causal Model}
% Data Generating process
% - Agents and their decisions
% - Factors that influence each decision
% -
% -
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Data Generating Process}
% study sponsors
Study sponsors decide whether to start a Phase III trial and whether to terminate it.
\\
They ask themselves:
\begin{itemize}
\item Do safety incidents require terminating a trial?
\item Do efficacy results indicate the trial is worth continuing?
\item Is recruiting sufficient to achieve our results and contain costs?
\item Do expectations about future returns justify our expenditures?
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Data Generating Process}
% participants
Participants decide to enroll (and disenroll) themselves in a trial based on
\begin{itemize}
\item Disease severity
\item Relative safety/efficacy compared to other treatments
\end{itemize}
Study sponsors plan their enrollment considering
\begin{itemize}
\item Total population affected
\item Likely participant response rates
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Data Generating Process}
% Trial Snapshots and dependencies.
During a trial, the study sponsor reports snapshots of their trial.
This includes updates to:
\begin{itemize}
\item enrollment (actual or anticipated)
\item current recruitment status (Recruiting, Active not recruiting, etc)
\item study sponsor
\item planned completion dates
\item elapsed duration
\end{itemize}
Note that final enrollment and the final status (Completed or Terminated)
of the trial are jointly determined.
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Causal Diagram: Key Pathways}
% Estimating Direct vs Total Effects
\begin{figure}
\resizebox{!}{0.5\textheight}{
\tikzfig{../assets/tikzit/CausalGraph}
}
\caption{Causal Diagram highlighting direct and total pathways}
\label{FIG:CausalDiagram}
\end{figure}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Causal Diagram: Backdoor Criterion}
\small
\begin{block}{$d$-separation}
A set $S$ of nodes blocks a path $p$ if either
\begin{enumerate}
\item $p$ contains at least one arrow-emitting node in $S$
\item $p$ contains at least one collision node $c$ that is outside $S$
and has no descendants in $S$.
\end{enumerate}
If $S$ blocks all paths from $X$ to $Y$, then it is said to ``$d$-separate''
$X$ and $Y$, and then $X \perp Y \mid S$.
\end{block}
\begin{block}{Back-Door Criterion}
A set $S$ of covariates is admissible as controls on the
causal relationship $X \rightarrow Y$ if:
\begin{enumerate}
\item No element of $S$ is a descendant of $X$
\item The elements of $S$ $d$-separate all paths from $X$ to $Y$ that include
parents of $X$.
\end{enumerate}
\end{block}
\cite{pearl_causality_2000}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Causal Diagram}
Key takeaways
\begin{itemize}
\item Measuring enrollment prior to trial completion is necessary for causal identification.
\item The backdoor criterion gives us the following adjustment sets:
\begin{itemize}
\item Total Effect for Market on Termination; Population, Condition, Phase III
\item Direct Effects for Enrollment, Market on Termination; Population, Condition, Phase III,
Elapsed Duration, Planned Enrollment
\end{itemize}
\item Enrollment requires imputation
\end{itemize}
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Data %%%%%%%%%%%%%%%%%%%%%%%%
\section{Data}
%-------------------------------------------------------------------------------------
%----------------------------------
%%%%%%%%%%%%%%%%%%%% Sources
\subsection{Sources}
%----------------------------------
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Data Sources}
\begin{itemize}
\item ClinicalTrials.gov - AACT \& custom scripts
\begin{itemize}
\item Select trials of interest
\item Trial details:
\begin{itemize}
\item conditions
\item final status
\item drugs/interventions
\end{itemize}
\item Trial snapshots:
\begin{itemize}
\item enrollment (anticipated, planned, or actual)
\item elapsed duration
\item current status
\end{itemize}
\end{itemize}
\item Medical Subject Headings (MeSH) Thesaurus
\begin{itemize}
\item A standardized nomenclature used to classify interventions
and conditions in the clinical trials database.
\end{itemize}
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Data Sources}
\begin{itemize}
\item NSDE Files (NDC SPL Data Elements: National Drug Code, Structured Product Labeling)
\begin{itemize}
\item Contains information about when a given drug was on the market.
\end{itemize}
\item RxNorm
\begin{itemize}
\item Links pharmaceuticals between MeSH standardized terms and
NSDE files.
\end{itemize}
\item Global Burden of Disease Study (2019)
\begin{itemize}
\item Estimates of DALYs for categories of disease
\item Links of Categories to ICD-10 Codes
\end{itemize}
\item ICD-10 (2019)
\begin{itemize}
\item WHO version
\item CMS version (ICD-10-CM, Clinical Modification)
\item Used to group disease conditions in hierarchical model
\end{itemize}
\item Unified Medical Language System Thesaurus
\begin{itemize}
\item Used to link MeSH standardized terms and ICD-10 conditions
\item Manual matching process
\end{itemize}
\end{itemize}
\end{frame}
%----------------------------------
%%%%%%%%%%%%%%%%%%%% Integration
\subsection{Integration}
%----------------------------------
%-------------------------------
\begin{frame}
\frametitle{Data Summaries}
%put summaries now
\begin{itemize}
\item Number of Phase III, FDA monitored Drug Trials: 1,981
\item Number of Trials matched to ICD-10: 186
\item Number of Trials matched to ICD-10 with population measures: 67
(51 completed, 16 terminated)
\item Number of Snapshots: 616
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Data used}
The following data points were used.
\begin{itemize}
\item elapsed duration
\item asinh(number of brands)
\item asinh(high sdi DALY estimate)
\item asinh(high-medium sdi DALY estimate)
\item asinh(medium sdi DALY estimate)
\item asinh(low-medium sdi DALY estimate)
\item asinh(low sdi DALY estimate)
\end{itemize}
The asinh transform was used because it parallels $\text{ln}(x)$ for
large values of $x$ but is also defined at zero, where $\text{asinh}(0)=0$.
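To make the comparison concrete, recall the standard identity
(stated here as background, not taken from the estimation code):
\begin{equation*}
\text{asinh}(x) = \ln\left(x + \sqrt{x^2 + 1}\right) \approx \ln(2x) = \ln(x) + \ln(2)
\quad \text{for large } x,
\end{equation*}
so for large values the transform differs from $\ln(x)$ only by the constant $\ln(2)$.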
\end{frame}
%----------------------------------
\begin{frame}
\frametitle{Summaries: Trial Durations}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/2023-04-12_durations_hist.png}
\caption{Trial Durations (days)}
\label{FIG:durations}
\end{figure}
\end{frame}
%----------------------------------
\begin{frame}
\frametitle{Summaries: snapshots}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/2023-04-12_snapshots_hist.png}
\caption{Number of Snapshots per matched trial}
\label{FIG:snapshots}
\end{figure}
\end{frame}
%----------------------------------
\begin{frame}
\frametitle{Summaries: snapshots}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/2023-04-12_status_duration_snapshots_points.png}
\caption{Scatterplot of snapshot count and durations}
\label{FIG:snapshot_duration_scatter}
\end{figure}
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Econometric Model %%%%%%%%%%%%%%%%%%%%%%%%
\section{Econometric model}
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Econometric Model}
Estimating the total effect of brands on market
\begin{align}
y_n &\sim \text{Bernoulli}(p_n) \\
p_n &= \text{logisticfn}(x_n * \beta(d_n)) \\
\beta_k(d) &\sim \text{Normal}(\mu_k, \sigma_k) \\
\mu_k &\sim \text{Normal}(0,1) \\
\sigma_k &\sim \text{Gamma}(2,1)
\end{align}
$k$ indexes parameters and $d_n$ represents the ICD-10 group the trial corresponds to.
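Assuming $\text{logisticfn}$ denotes the standard logistic (inverse-logit)
function, the second line can be read as
\begin{equation*}
p_n = \frac{1}{1 + \exp\left(-x_n \beta(d_n)\right)},
\end{equation*}
where $x_n \beta(d_n)$ is taken to be the inner product of the covariate
vector for observation $n$ with the coefficient vector of its ICD-10 group.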
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Results %%%%%%%%%%%%%%%%%%%%%%%%
\section{Results}
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Results}
Because Bayesian estimation is typically done numerically, we will first
validate convergence.
Then we will take a look at preliminary results.
Sampling details
\begin{itemize}
\item 6 chains
\item 2,500 warmup, 2,500 sampling runs
\item seed = 11021585
\end{itemize}
\end{frame}
%----------------------------------
%%%%%%%%%%%%%%%%%%%% Convergence Tests
\subsection{Convergence}
%----------------------------------
%-------------------------------
\begin{frame}
\frametitle{Warnings}
\begin{itemize}
\item There were no diverging transitions.
\item There were 15,000 transitions that exceeded max treedepth.
Sampling efficiency is poor.
\item All chains had low Bayesian Fraction of Missing Information.
Some areas of the distribution were poorly explored.
\item R-hat = $1.23$, ideal is around 1, chains did not mix well.
\item Bulk and Tail Effective Sample sizes were low,
suggesting mean and variance/quantile estimates will be unreliable.
\end{itemize}
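As a rough check against the sampling details reported earlier,
and assuming the treedepth count refers to post-warmup draws only:
\begin{equation*}
6 \text{ chains} \times 2{,}500 \text{ sampling runs} = 15{,}000 \text{ transitions},
\end{equation*}
which would mean that every post-warmup transition reached the maximum treedepth.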
\cite{mc-stan}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Convergence: Mu}
\begin{figure}
\includegraphics[height=0.9\textheight]{../assets/img/2023-04-11_mu_points.png}
\caption{Hyperparameter Points Plots: Mu}
\label{FIG:caption}
\end{figure}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Convergence: Sigma}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/2023-04-11_sigma_points.png}
\caption{Hyperparameter Points Plots: Sigma}
\label{FIG:caption}
\end{figure}
\end{frame}
%----------------------------------
%%%%%%%%%%%%%%%%%%%% Preliminary Results
\subsection{Preliminary Results}
%----------------------------------
%-------------------------------
\begin{frame}
\frametitle{Preliminary Results: Mu}
\begin{columns}
\begin{column}{0.3\textwidth}
\begin{enumerate}
\item elapsed duration
\item asinh(n\_brands)
\item asinh(high sdi)
\item asinh(high-medium sdi)
\item asinh(medium sdi)
\item asinh(low-medium sdi)
\item asinh(low sdi)
\end{enumerate}
\end{column}
\begin{column}{0.7\textwidth}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/2023-04-11_mu_dist.png}
\caption{Hyperparameter Distribution: Mu}
\label{FIG:caption}
\end{figure}
\end{column}
\end{columns}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Preliminary Results: Sigma}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/2023-04-11_sigma_dist.png}
\caption{Hyperparameter Distribution: Sigma}
\label{FIG:caption}
\end{figure}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Interpretation}
All of the following interpretations are made in the context of insufficient data.
\begin{enumerate}
\item Elapsed Duration (Mu[1]): Trending Negative, reduced probability of termination.
\item Number of Brands(Mu[2]): Trending Positive, increased probability of termination.
\item Population Measures (Mu[3]-Mu[7])
\begin{enumerate}
\item What is most surprising is that these are both positive and negative.
Probably need more data.
\end{enumerate}
\item It is surprising to see the wide distribution in sigma values.
\end{enumerate}
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Improvements %%%%%%%%%%%%%%%%%%%%%%%%
\section{Improvements}
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Proposed improvements}
\begin{enumerate}
\item Match more trials to ICD-10 codes
\item Improve Measures of Market Conditions
\item Adjust Covariance Structure
\item Find Reasonable Priors
\item Remove disease categories that don't exist in the data from the priors
\item Imputing Enrollment
\item Improve Population Estimates
\end{enumerate}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Questions?}
\begin{center} \huge Questions? \end{center}
\end{frame}
%-------------------------------
\begin{frame}[allowframebreaks]
\frametitle{Bibliography}
\printbibliography
\end{frame}
%-------------------------------
\end{document}
%=========================================
%\begin{frame}
% \frametitle{MarginalRevenue}
% \begin{figure}
% \tikzfig{../Assets/owned/ch8_MarginalRevenue}
% \includegraphics[height=\textheight]{../Assets/copyrighted/KrugmanObsterfeldMeliz_fig8-7.jpg}
% \label{FIG:costs}
% \caption{Average Cost Curve as firms enter.}
% \end{figure}
%\end{frame}
%-------------------------------
%\begin{frame}
% \frametitle{Columns}
% \begin{columns}
% \begin{column}{0.5\textwidth}
% \end{column}
% \begin{column}{0.5\textwidth}
% \begin{figure}
% \tikzfig{../Assets/owned/ch7_EstablishedAdvantageExample2}
% \label{FIG:costs}
% \caption{Setting the Stage}
% \end{figure}
% \end{column}
% \end{columns}
%\end{frame}
% %---------------------------------------------------------------

@ -0,0 +1,916 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Beamer Presentation
% LaTeX Template
% Version 1.0 (10/11/12)
%
% This template has been downloaded from:
% http://www.LaTeXTemplates.com
%
% License:
% CC BY-NC-SA 3.0 (http://creativecommons.org/licenses/by-nc-sa/3.0/)
%
% Changed theme to WSU by William King
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%----------------------------------------------------------------------------------------
% PACKAGES AND THEMES
%----------------------------------------------------------------------------------------
\documentclass[xcolor=dvipsnames,aspectratio=169]{beamer}
%Import Preamble bits
\input{../assets/preambles/FormattingPreamble.tex}
\input{../assets/preambles/TikzitPreamble.tex}
\input{../assets/preambles/MathPreamble.tex}
\input{../assets/preambles/BibPreamble.tex}
\input{../assets/preambles/GeneralPreamble.tex}
%----------------------------------------------------------------------------------------
% TITLE PAGE
%----------------------------------------------------------------------------------------
\title[Clinical Trials]{The Effects of Market Conditions on Recruitment and Completion of Clinical Trials}
\author{Will King} % Your name
\institute[WSU] % Your institution as it will appear on the bottom of every slide, may be shorthand to save space
{
Washington State University \\ % Your institution for the title page
\medskip
\textit{william.f.king@wsu.edu} % Your email address
}
\date{\today} % Date, can be changed to a custom date
\begin{document}
\begin{frame}
\titlepage % Print the title page as the first slide
\end{frame}
%----------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Clinical Trials} % Table of contents slide, comment this out to remove it
% - Intro and hook (Clinical Trials are key part of pharmacological pipeline)
Pharmaceuticals are a frequently discussed aspect of health care cost management.
Their development is dictated by scientific and regulatory hurdles
including passing clinical trials
(\cite{noauthor_fda_nodate}),
while their market is characterized by strategic competition and ambiguous
patent protection
(\cite{van_der_gronde_addressing_2017}).
\vspace{12pt}
This research investigates the ways in which market conditions
affect clinical trial completion.
\end{frame}
%--------------------------------
\begin{frame}[allowframebreaks] %Allow frame breaks
\frametitle{Overview} % Table of contents slide, comment this out to remove it
\tableofcontents
% - Intro and hook
% - Literature review
% - Causal Identification
% - Data
% - Econometric model
% - Results
% - Improvements
\end{frame}
%-------------------------------
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Introduction and Background %%%%%%%%%%%%%%%%%%%%%%%%
\section{Background}
% TOC
% - Background on drug process
% - Literature on clinical trials
% - My questions
% add info about trials
% - Requirements (pre registered design [2007], updated "regularly" on clinicaltrials.gov)
% - Phases (1,2,3,4, mixed)
% - Safety and Ethics (oversight boards, restrictions on payments)
% - Approval processes (biologics vs small-molecule)
% add info about drugs
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Clinical Trials and Drug Development}
The FDA requires clinical trials before approving new drug compounds
\begin{itemize}
\item Pre-registered design
\item Updated regularly on clinicaltrials.gov
\item Often requires an oversight board.
\item Goal is to prove efficacy and safety of a compound/dosage/route.
\item A new drug candidate (NDC) must complete 3 phases of clinical trials before approval.
\item Phases are reviewed with FDA.
\item Not all clinical trials are for new drugs.
\end{itemize}
\end{frame}
%-----------------------------
\begin{frame}
\frametitle{Literature Highlights}
\begin{itemize}
\item \cite{van_der_gronde_addressing_2017}:
High level synthesis of overall discussion regarding drug costs.
Both academic and non-academic sources.
\item \cite{hwang_failure_2016}:
Answered the question ``Why do late-stage (phase III) trials fail?''
Found that efficacy, safety, and competition reasons accounted for
57\%, 17\%, and 22\% of failures, respectively.
\item \cite{abrantes-metz_pharmaceutical_2004}:
Described how drugs progress through the 3 phases of clinical trials
and correlations between various trial characteristics and the
clinical trial failures.
\item \cite{khmelnitskaya_competition_2021}:
Modeled clinical trial life-cycle of drugs, found method to separate
scientific from competitive reasons for failure to progress to the
next phase.
% \item \cite{}:
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{This research, in context}
In contrast to previous work looking at multiple phases of trials,
I seek to figure out what causes individual trials to fail.
% \vspace{12pt}
%
% Instead of focusing on the drug development pipeline, I attempt to
% investigate the population of drug-based, phase III trials.
%
\vspace{12pt}
Questions
\begin{itemize}
\item How do the competitors on the market affect clinical trial completion?
\item How is this effect moderated by the enrollment of participants?
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Audience Questions}
\begin{center}What can I clarify?\end{center}
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Causality and Data %%%%%%%%%%%%%%%%%%%%%%%%
\section{Causal Story and Data}
% TOC
% - Causal Story (no subsection)
% - Clinical trials: targets specific drug/condition combination.
% - Enrollment process: patients counsel with providers
% - Trials terminate if unsafe, ineffective, unprofitable, or cannot enroll patients
% - Ethical concerns exist throughout.
% - This is complicated by the fact that the experiment reveals information over time.
% - Formalization
% - Data Sources
% Data Generating process
% - Agents and their decisions
% - Factors that influence each decision
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}[shrink=10] %evil option is helpful here.
\frametitle{How do clinical trials proceed?}
\begin{columns}[T]
\begin{column}{0.5\textwidth}
What does a \textit{Completed} trial look like?
\begin{enumerate}
\item Study sponsor comes up with design
\item Apply for NCT ID from ClinicalTrials.gov
\item Begin enrolling participants
\item Update ClinicalTrials.gov to recruit
\item Close Enrollment
\item Update ClinicalTrials.gov as not recruiting*
\item Reach primary objectives
\item Update ClinicalTrials.gov as complete
\item Reach secondary objectives
\item Update ClinicalTrials.gov with more information
\end{enumerate}
\end{column}
\begin{column}{0.5\textwidth}
What does a \textit{Terminated} trial look like?
\begin{enumerate}
\item Study sponsor comes up with design
\item Apply for NCT ID from ClinicalTrials.gov
\item Begin enrolling participants
\item Update ClinicalTrials.gov to advertise
\item Run into issues:
\begin{itemize}
\item Safety
\item Efficacy
\item Profitability
\item Feasibility (enrollment, PI leaves, etc.)
\end{itemize}
\item Close Enrollment*
\item Decide to terminate clinical trial.
\item Update ClinicalTrials.gov as terminated.
\end{enumerate}
\end{column}
\end{columns}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{ClinicalTrials.gov}
Thus ClinicalTrials.gov becomes an (append-only) repository of
the ``current'' status of clinical trials.
As it is designed to help facilitate enrollment in clinical trials,
the record includes important information such as
\begin{itemize}
\item drugs
\item study arms
\item conditions
\item expected and final enrollment figures
\item current status
\end{itemize}
ClinicalTrials.gov also reports the history from previous
updates.
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Decision-Making Process}
% study sponsors
Study sponsors decide whether to start a Phase 3 trial and whether to terminate it.
\\
They ask themselves:
\begin{itemize}
\item Do safety incidents require terminating a trial?
\item Do efficacy results indicate the trial is worth continuing?
\item Is recruiting sufficient to achieve our results and contain costs?
\item Do expectations about future returns justify our expenditures?
\end{itemize}
They plan their enrollment considering
\begin{itemize}
\item Total population affected
\item Likely participant response rates
\item Their network of clinicians
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Decision-Making Process}
% participants
Participants decide to enroll (and dis-enroll) themselves in a trial based on
\begin{itemize}
\item Doctor Recommendations
\item Disease severity
\item Relative safety/efficacy compared to other treatments
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Questions?}
\begin{center}What clarifying questions do you have?\end{center}
\end{frame}
%-------------------------------
%--------------------------------
%%%%%%%%%%%%%%%%%%%% Causal Formalization
\subsection{Formalization}
% - Introduce basic triangle
% - discuss total vs direct effects
% -
% - Add confounders and controls
% - Introduce backdoor criterion
%--------------------------------
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Why this approach?}
\begin{figure}
\includegraphics[height=0.8\textheight]{../assets/img/methodology_trial.png}
\caption{``If you think THAT'S unethical, you should see the stuff we approved via our Placebo IRB.''
- \url{https://xkcd.com/2726}
}
\label{FIG:xkcd2726}
\end{figure}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Framing my Questions}
\begin{columns}[T]
\begin{column}{0.5\textwidth}
Two potential causes of trial termination include
\begin{enumerate}
\item Alternative (competitor) treatments exist
\begin{itemize}
\item reduces future profitability.
\item reduces incentives to enroll as participants.
\end{itemize}
\item It can be difficult to recruit patients
\begin{itemize}
\item Are there few patients?
\item Are potential participants choosing other alternatives?
\end{itemize}
\end{enumerate}
\end{column}
\begin{column}{0.5\textwidth}
Overall this can be described graphically as:
\begin{figure}
\scalebox{0.8}{
\tikzfig{../assets/tikzit/4Node}
}
\caption{Total Effect}
\label{FIG:4Node}
\end{figure}
\end{column}
\end{columns}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Causal Effects}
%Discuss the two different effects: total effect, direct effects
\begin{columns}
\begin{column}{0.5\textwidth}
Total Effect of Competitors
\begin{figure}
\scalebox{0.8}{
\tikzfig{../assets/tikzit/4Node_total}
}
\caption{Total Effect}
\label{FIG:4Node_total}
\end{figure}
\end{column}
\begin{column}{0.5\textwidth}
Direct Effects of Competitors and Enrollment
\begin{figure}
\scalebox{0.8}{
\tikzfig{../assets/tikzit/4Node_direct}
}
\caption{Direct Effect}
\label{FIG:4Node_direct}
\end{figure}
\end{column}
\end{columns}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Rephrasing Questions}
To rephrase my questions
\begin{enumerate}
\item How large is the total effect of increasing the number
of competing drugs on completing clinical trials?
\item How large is the direct effect of increasing the number
of competing drugs on completing clinical trials?
\end{enumerate}
\end{frame}
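%-------------------------------
\begin{frame}
\frametitle{Rephrasing Questions: A Sketch in do-Notation}
One way to make the two questions precise is sketched below.
The symbols are assumptions introduced for illustration:
$C$ is the number of competing drugs, $E$ is enrollment status,
and $Y=1$ if the trial is terminated.
\vspace{12pt}
Total effect of moving from $c$ to $c'$ competitors:
\[
P(Y=1 \mid \mathrm{do}(C=c')) - P(Y=1 \mid \mathrm{do}(C=c))
\]
Controlled direct effect, holding enrollment at $e$:
\[
P(Y=1 \mid \mathrm{do}(C=c', E=e)) - P(Y=1 \mid \mathrm{do}(C=c, E=e))
\]
The total effect includes the path that runs through enrollment;
the direct effect holds enrollment fixed and so excludes it.
\end{frame}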
%-------------------------------
\begin{frame}
\frametitle{Additional Concerns}
%Confounders
Of course, there are other confounding relationships:
\begin{enumerate}
\item Population Effects
\item Fundamental Safety and Efficacy of compound/dosage/route
\item How long the trial has been running (elapsed duration)
% \item
\end{enumerate}
%TODO: Fill out with more details from graph
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Complete graph}
%introduce backdoor criterion
\begin{figure}
\scalebox{0.6}{
\tikzfig{../assets/tikzit/CausalGraph}
}
\caption{Full Causal Graph}
\label{FIG:CausalGraph}
\end{figure}
Discuss concerns about Elapsed Duration and Enrollment
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Causal Diagram: Backdoor Criterion}
\small
\begin{block}{$d$-separation}
A set $S$ of nodes blocks a path $p$ on a DAG if either
\begin{enumerate}
\item $p$ contains at least one arrow-emitting node in $S$
\item $p$ contains at least one collision node $c$ that is outside $S$
and has no descendants in $S$.
\end{enumerate}
If $S$ blocks all paths from $X$ to $Y$, then it is said to ``$d$-separate''
$X$ and $Y$, and then $X \perp Y | S$.
\end{block}
\begin{block}{Back-Door Criterion}
A set $S$ of covariates is admissible as controls on the
causal relationship $X \rightarrow Y$ if:
\begin{enumerate}
\item No element of $S$ is a descendant of $X$
\item The elements of $S$ block all paths from $X$ to $Y$ that contain
an arrow into $X$ (i.e., that pass through a parent of $X$).
\end{enumerate}
\end{block}
\cite{pearl_causality_2000}
\end{frame}
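%-------------------------------
\begin{frame}
\frametitle{Back-Door Adjustment: A Sketch}
The criterion matters because of what it licenses.
If a set $S$ satisfies the back-door criterion relative to $(X, Y)$,
the causal effect of $X$ on $Y$ can be computed from observational
quantities by adjusting for $S$ \cite{pearl_causality_2000}:
\[
P(y \mid \mathrm{do}(x)) = \sum_{s} P(y \mid x, s)\, P(s)
\]
As an illustration only, with a single hypothetical binary confounder $S$:
\[
P(y \mid \mathrm{do}(x)) = P(y \mid x, S=0)\,P(S=0) + P(y \mid x, S=1)\,P(S=1)
\]
The weights are $P(s)$, not $P(s \mid x)$; that difference is what
separates adjustment from simple conditioning.
\end{frame}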
%-------------------------------
\begin{frame}
\frametitle{Sufficient Adjustment Set}
%introduce backdoor criterion
Thus the required adjustment set depends on the effects of interest.
For the total effect these are controls for:
\begin{itemize}
\item Proceed with Phase III
\item Condition
\item Population
\end{itemize}
Discuss Regime Switching
For the direct effect these are controls for:
\begin{itemize}
\item Proceed with Phase III
\item Condition
\item Population (optional)
\item Enrollment
\end{itemize}
Not causally identified due to Regime Switching
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Other testable hypotheses}
One advantage of this approach is that tools can automatically
\begin{itemize}
\item verify causal identification
\item generate hypotheses to verify model
\end{itemize}
Automatic hypotheses
% \begin{itemize}
% \item Condition $\perp$ Elapsed Duration
% \item Decision to continue Phase III $\perp$ Elapsed Duration
% \item Decision to continue Phase III $\perp$ Market Conditions | Condition
% \item Decision to continue Phase III $\perp$ Population | Condition
% \item Elapsed Duration $\perp$ Market Conditions
% \item Elapsed Duration $\perp$ Population
% \item Terminated $\perp$ Population | Condition, Decision to continue Phase III, Elapsed Duration, Enrollment Status, Market Conditions
% \end{itemize}
% \href{http://dagitty.net/mLyFuc5}{Dagitty.net model}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Questions?}
\end{frame}
%-------------------------------
%--------------------------------
%%%%%%%%%%%%%%%%%%%% Data sources
\subsection{Data Sources}
% TOC
% - Main Data Sources
% - ClinicalTrials.gov and AACT
% - IHME Burden of Disease
% - Marketing Data
% - MeSH, RxNorm/RxNav
% - How did I Link Data Sources
% - Data Sizes
%--------------------------------
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Data Sources}
\begin{itemize}
\item ClinicalTrials.gov - AACT \& custom scripts
\begin{itemize}
\item Select trials of interest
\item Trial details:
\begin{itemize}
\item conditions
\item final status
\item drugs/interventions
\end{itemize}
\item Trial snapshots:
\begin{itemize}
\item elapsed duration
\item enrollment status (NYE,EBI,R,ANR)
\end{itemize}
\end{itemize}
\item Medical Subject Headings (MeSH) Thesaurus
\begin{itemize}
\item A standardized nomenclature used to classify interventions
and conditions in the clinical trials database.
\end{itemize}
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Data Sources}
\begin{itemize}
\item USP Drug Classification (2023)
\begin{itemize}
\item Used to identify which drugs share a category and class.
\end{itemize}
\item NSDE Files (National Drug Code Structured Product Labeling Data Elements)
\begin{itemize}
\item Contains information about when a given drug was on the market.
\end{itemize}
\item RxNorm
\begin{itemize}
\item Links pharmaceuticals between MeSH standardized terms and
NSDE files.
\item Used to find brand names that share active ingredients with those from trial.
\end{itemize}
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame} %Allow frame breaks
\frametitle{Data Sources}
\begin{itemize}
\item Global Burden of Disease Study (2019)
\begin{itemize}
\item Estimates of DALYs for categories of disease
\item Links of Categories to ICD-10 Codes
\end{itemize}
\item ICD-10 (2019)
\begin{itemize}
\item WHO version
\item CMS version (ICD-10-CM, Clinical Modification)
\item Used to group disease conditions in hierarchical model
\end{itemize}
\item Unified Medical Language System Thesaurus
\begin{itemize}
\item Used to link MeSH standardized terms and ICD-10 conditions
\item Manual matching process
\end{itemize}
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Linking data}
%
The following linking process was used:
\begin{enumerate}
\item AACT trials to snapshots (internal ID)
\item AACT trials to ICD-10 (hand match)
\item ICD-10 to IHME (IHME)
\item Snapshots to drug brands (RxNorm/RxNav/MeSH, SPL)
\item AACT to USP DC alternates (RxNorm, USP DC, hand match)
\end{enumerate}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Data used}
The following data points were used.
\begin{itemize}
\item elapsed duration
\item enrollment status
\item asinh(brands with identical ingredients)
\item asinh(brands in USP-DC category)
\item asinh(high SDI DALY estimate)
\item asinh(high-medium SDI DALY estimate)
\item asinh(medium SDI DALY estimate)
\item asinh(low-medium SDI DALY estimate)
\item asinh(low SDI DALY estimate)
\end{itemize}
The asinh transform was used because it tracks $\ln(x)$ (up to an additive
constant) for large values of $x$ but is also defined at zero, with $\text{asinh}(0)=0$.
\end{frame}
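%-------------------------------
\begin{frame}
\frametitle{Why asinh? A Short Derivation}
A brief sketch of why asinh behaves like a logarithm for counts.
The inverse hyperbolic sine has the closed form
\[
\text{asinh}(x) = \ln\left(x + \sqrt{x^2 + 1}\right).
\]
For large $x$, $\sqrt{x^2+1} \approx x$, so
\[
\text{asinh}(x) \approx \ln(2x) = \ln(x) + \ln(2),
\]
while at zero
\[
\text{asinh}(0) = \ln(1) = 0,
\]
where $\ln(0)$ is undefined.
Observations with a count of zero therefore stay in the sample, and for
large counts the transform differs from the log only by a constant.
\end{frame}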
%-------------------------------
\begin{frame}
\frametitle{Measures of Causes and Effects}
Here are the actual measures used for causes and effects
\begin{itemize}
\item Final Status: Measured from AACT - status when trial is over.
\item Competitors on Market: Measured by the number of drugs
\begin{itemize}
\item with same active ingredients (at the time of the snapshot)
\item sharing the USP DC category and class (in 2023)
\end{itemize}
\end{itemize}
Effects are measured in parameter values and changes in probability
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Adjustment set}
Here are the actual measures of the adjustment set
\begin{itemize}
\item Enrollment: Measured by enrollment status at the snapshot level.
\item Elapsed Duration: Measured at snapshot level
by $\frac{\text{Current Date} - \text{Start Date}}{\text{Planned Completion Date} - \text{Start Date}}$
\item Population Measures
\begin{itemize}
\item IHME Global Burden of Disease: DALYs, spread over 5 levels of the Socio-demographic Index (SDI)
\end{itemize}
\item Beliefs about safety \& efficacy: Restricted to Phase 3 trials.
\item Disease Type: Hierarchical parameters in model
\end{itemize}
Note the implicit conditioning on trials treating diseases with IHME data\footnote{
IHME does not track data for W61.62XD: Struck by duck, subsequent encounter
}.
\end{frame}
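%-------------------------------
\begin{frame}
\frametitle{Elapsed Duration: Worked Example}
A hypothetical snapshot, with dates invented for illustration
(not taken from the data):
\begin{itemize}
\item Start Date: 2015-01-01
\item Planned Completion Date: 2017-01-01 (731 days after start)
\item Current (snapshot) Date: 2016-01-01 (365 days after start)
\end{itemize}
\[
\text{Elapsed Duration} = \frac{365}{731} \approx 0.50
\]
Under this definition, a value above 1 would indicate a snapshot taken
after the planned completion date.
\end{frame}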
%-------------------------------
\begin{frame}
\frametitle{Other Details}
Other Trial Selection Criteria
\begin{itemize}
\item Interventional Study
\item Involved an FDA Regulated Drug
\item Phase 3 trial
\item Started after 2010-01-01
\item Ended before 2022-01-01
\end{itemize}
\end{frame}
%----------------------------------
%%%%%%%%%%%%%%%%%%%% Summary
\subsection{Data Summary}
%----------------------------------
%-------------------------------
\begin{frame}
\frametitle{Data Summaries}
%TODO: Update
\begin{itemize}
\item Number of Phase III, FDA monitored Drug Trials: 1,981
\item Number of Trials matched to ICD-10:
\item Number of Trials matched to ICD-10 with population measures:
( completed, terminated)
\item Number of Snapshots:
\end{itemize}
\end{frame}
%%----------------------------------
%\begin{frame}
% \frametitle{Summaries: Trial Durations}
% \begin{figure}
% \includegraphics[height=0.8\textheight]{../assets/img/2023-04-12_durations_hist.png}
% \label{FIG:durations}
% \caption{Trial Durations (days)}
% \end{figure}
%\end{frame}
%%----------------------------------
%\begin{frame}
% \frametitle{Summaries: snapshots}
% \begin{figure}
% \includegraphics[height=0.8\textheight]{../assets/img/2023-04-12_snapshots_hist.png}
% \label{FIG:snapshots}
% \caption{Number of Snapshots per matched trial}
% \end{figure}
%\end{frame}
%%----------------------------------
%\begin{frame}
% \frametitle{Summaries: snapshots}
% \begin{figure}
% \includegraphics[height=0.8\textheight]{../assets/img/2023-04-12_status_duration_snapshots_points.png}
% \label{FIG:snapshot_duration_scatter}
% \caption{Scatterplot of snapshot count and durations}
% \end{figure}
%\end{frame}
%%-------------------------------
%\begin{frame}
% \frametitle{Questions?}
%
%\end{frame}
%-------------------------------
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Analysis %%%%%%%%%%%%%%%%%%%%%%%%
\section{Analysis}
% TOC
% - Review questions and datasets to use for each
% -
% -
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{General Approach}
\begin{itemize}
\item Logistic model
\item Bayesian Hierarchical model
\begin{itemize}
\item Allows for transfer learning between groups
\end{itemize}
\item Distribution of Predicted Differences
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Total Effects Model}
\begin{align}
y_i &\sim \text{Bernoulli}(p_i) \\
p_i &= \text{logistic}(X_i \vec\beta_{c(i)}) \\
\vec\beta_{c(i)} &\sim \text{MvNormal}(\vec\mu,\vec\sigma I)
\end{align}
\end{frame}
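%-------------------------------
\begin{frame}
\frametitle{Total Effects Model: Reading the Notation}
A sketch of how the pieces may fit together.
The reading of $c(i)$ and the symbols $X_i^{\ast}$ and $\Delta_i$ are
assumptions introduced here for illustration.
\begin{itemize}
\item $\text{logistic}(z) = \frac{1}{1+e^{-z}}$ maps the linear predictor to a probability.
\item $c(i)$ is read as the disease group of observation $i$, so each group
has its own $\vec\beta_{c(i)}$ drawn from a shared distribution.
\item A predicted difference compares an observation with a copy $X_i^{\ast}$
in which only the competitor measure is changed:
\end{itemize}
\[
\Delta_i = \text{logistic}(X_i^{\ast} \vec\beta_{c(i)}) - \text{logistic}(X_i \vec\beta_{c(i)})
\]
Evaluating $\Delta_i$ at each posterior draw of $\vec\beta_{c(i)}$ would
give a distribution of predicted differences.
\end{frame}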
%-------------------------------
\begin{frame}
\frametitle{}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Questions?}
\end{frame}
%--------------------------------
%%%%%%%%%%%%%%%%%%%% Results
\subsection{Results}
%--------------------------------
%--------------------------------
\subsubsection{Total Effect}
% - Review Parameter Values
% - hyperparameters
% - Table of MLE
% - Distributions
% - betas
% - Table of MLE
% - Distributions
% - Review Posterior Prediction for interventions
%--------------------------------
%-------------------------------
\begin{frame}
\frametitle{Results}
Because Bayesian estimation is typically done numerically, we will first
validate convergence.
Then we will take a look at preliminary results.
Sampling details
%TODO: Update
\begin{itemize}
\item 6 chains
\item 2,500 warm-up, 2,500 sampling runs
\item seed = 11021585
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Questions?}
\end{frame}
%-------------------------------
%--------------------------------
\subsubsection{Direct Effects}
% - Review Parameter Values
% - hyperparameters
% - Table of MLE
% - Distributions
% - betas
% - Table of MLE
% - Distributions
% - Review Posterior Prediction for interventions
%--------------------------------
%-------------------------------
\begin{frame}
\frametitle{Convergence}
Sampling details
%TODO: UPDATE
\begin{itemize}
\item 6 chains
\item 2,500 warm-up, 2,500 sampling runs
\item seed = 11021585
\end{itemize}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Questions?}
\end{frame}
%-------------------------------
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Conclusion %%%%%%%%%%%%%%%%%%%%%%%%
\section{Conclusion}
%-------------------------------------------------------------------------------------
%-------------------------------
\begin{frame}
\frametitle{Proposed improvements}
\begin{enumerate}
\item Match more trials to ICD-10 codes and Formularies
\item Add more formularies
\item Remove disease categories that don't exist in the data from the priors
\item Impute enrollment
\end{enumerate}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Summary}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Final Questions}
\begin{center}{\huge The time is yours to ask any remaining questions.}\end{center}
\end{frame}
%-------------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%% Appendicies %%%%%%%%%%%%%%%%%%%%%%%%
\section{Appendices}
%-------------------------------------------------------------------------------------
%----------------------------------
%%%%%%%%%%%%%%%%%%%% Convergence Tests
\subsection{Convergence}
%----------------------------------
%-------------------------------
\begin{frame}
\frametitle{Warnings}
%TODO: UPDATE
\begin{itemize}
\item There were no divergent transitions.
\item There were 15,000 transitions that exceeded max treedepth.
Sampling efficiency is poor.
\item All chains had low Bayesian Fraction of Missing Information.
Some areas of the distribution were poorly explored.
\item $\hat{R} = 1.23$; the ideal is close to 1, so the chains did not mix well.
\item Bulk and Tail Effective Sample sizes were low,
suggesting mean and variance/quantile estimates will be unreliable.
\end{itemize}
\cite{mc-stan}
\end{frame}
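%-------------------------------
\begin{frame}
\frametitle{Warnings: What R-hat Compares}
\small
A sketch of the classic potential scale reduction factor, assuming $M$
chains of $N$ draws each.
The symbols are introduced here for illustration, and recent versions of Stan
report a rank-normalized, split-chain variant of this quantity.
\begin{align*}
W &= \frac{1}{M} \sum_{m=1}^{M} s_m^2, &
B &= \frac{N}{M-1} \sum_{m=1}^{M} (\bar\theta_m - \bar\theta)^2, \\
\widehat{\text{var}}^{+} &= \frac{N-1}{N} W + \frac{1}{N} B, &
\hat{R} &= \sqrt{\frac{\widehat{\text{var}}^{+}}{W}}.
\end{align*}
Here $s_m^2$ and $\bar\theta_m$ are the variance and mean within chain $m$,
and $\bar\theta$ is the mean across chains.
When the chains agree, $B$ is small relative to $W$ and $\hat{R}$ approaches 1;
a value such as $1.23$ suggests the chains still disagree about where the
posterior mass lies.
\end{frame}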
%-------------------------------
\begin{frame}
\frametitle{Convergence: Mu}
\begin{figure}
%TODO: UPDATE
%\includegraphics[height=0.9\textheight]{../assets/img/2023-04-11_mu_points.png}
\caption{Hyperparameter Points Plots: Mu}
\label{FIG:mu_points}
\end{figure}
\end{frame}
%-------------------------------
\begin{frame}
\frametitle{Convergence: Sigma}
\begin{figure}
%TODO: UPDATE
%\includegraphics[height=0.9\textheight]{../assets/img/2023-04-11_mu_points.png}
\caption{Hyperparameter Points Plots: Sigma}
\label{FIG:sigma_points}
\end{figure}
\end{frame}
%-------------------------------
\begin{frame}[allowframebreaks]
\frametitle{Bibliography}
\printbibliography
\end{frame}
%-------------------------------
\end{document}
%=========================================
%\begin{frame}
% \frametitle{MarginalRevenue}
% \begin{figure}
% \tikzfig{../Assets/owned/ch8_MarginalRevenue}
% \includegraphics[height=\textheight]{../Assets/copyrighted/KrugmanObsterfeldMeliz_fig8-7.jpg}
% \label{FIG:costs}
% \caption{Average Cost Curve as firms enter.}
% \end{figure}
%\end{frame}
%-------------------------------
%\begin{frame}
% \frametitle{Columns}
% \begin{columns}
% \begin{column}{0.5\textwidth}
% \end{column}
% \begin{column}{0.5\textwidth}
% \begin{figure}
% \tikzfig{../Assets/owned/ch7_EstablishedAdvantageExample2}
% \label{FIG:costs}
% \caption{Setting the Stage}
% \end{figure}
% \end{column}
% \end{columns}
%\end{frame}
% %---------------------------------------------------------------

@ -0,0 +1,76 @@
<mxfile host="app.diagrams.net" modified="2023-10-16T16:56:43.067Z" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:117.0) Gecko/20100101 Firefox/117.0" etag="KGTMuf8QjCsRL98EFcDT" version="22.0.4" type="device">
<diagram name="Page-1" id="I2jkAHik6Nw58ulx91uk">
<mxGraphModel dx="1194" dy="784" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="NYvqEB6TeUMlcUxJVjzP-12" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-4" target="NYvqEB6TeUMlcUxJVjzP-6">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-4" value="Enrollment" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="580" y="407" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-6" value="&lt;div&gt;Terminated&lt;/div&gt;" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="1">
<mxGeometry x="580" y="500" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-9" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-7" target="NYvqEB6TeUMlcUxJVjzP-6">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-13" style="rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=0;exitDx=0;exitDy=0;entryX=0;entryY=1;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-7" target="NYvqEB6TeUMlcUxJVjzP-4">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-7" value="Alternates" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="425" y="500" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-10" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-8" target="NYvqEB6TeUMlcUxJVjzP-7">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-11" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-8" target="NYvqEB6TeUMlcUxJVjzP-4">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-8" value="Population" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#e1d5e7;strokeColor=#9673a6;" vertex="1" parent="1">
<mxGeometry x="425" y="407" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-14" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-15" target="NYvqEB6TeUMlcUxJVjzP-16">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-15" value="Enrollment" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="580" y="600" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-16" value="&lt;div&gt;Terminated&lt;/div&gt;" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="1">
<mxGeometry x="580" y="693" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-17" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-19" target="NYvqEB6TeUMlcUxJVjzP-16">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-18" style="rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=0;exitDx=0;exitDy=0;entryX=0;entryY=1;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-19" target="NYvqEB6TeUMlcUxJVjzP-15">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-19" value="Alternates" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="1">
<mxGeometry x="425" y="693" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-22" value="Population" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="425" y="600" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-23" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-24" target="NYvqEB6TeUMlcUxJVjzP-25">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-24" value="Enrollment" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="1">
<mxGeometry x="580" y="790" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-25" value="&lt;div&gt;Terminated&lt;/div&gt;" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="1">
<mxGeometry x="580" y="883" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-26" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="NYvqEB6TeUMlcUxJVjzP-28" target="NYvqEB6TeUMlcUxJVjzP-25">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-28" value="Alternates" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="1">
<mxGeometry x="425" y="883" width="100" height="40" as="geometry" />
</mxCell>
<mxCell id="NYvqEB6TeUMlcUxJVjzP-29" value="Population" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="425" y="790" width="100" height="40" as="geometry" />
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>

@ -0,0 +1,157 @@
<mxfile host="app.diagrams.net" modified="2023-10-16T16:56:28.569Z" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:117.0) Gecko/20100101 Firefox/117.0" etag="NuyvuUXKfxc4dK4n2Szr" version="22.0.4" type="device">
<diagram name="Page-1" id="JzMD1Olg0EUQs1xPXrhH">
<mxGraphModel dx="1194" dy="784" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="qu7T5ML7RWsz-wKexjIw-20" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-7" target="qu7T5ML7RWsz-wKexjIw-19">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-23" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;dashed=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-7" target="qu7T5ML7RWsz-wKexjIw-17">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-7" value="Enroll / Track Participants" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#fff2cc;strokeColor=#d6b656;" vertex="1" parent="1">
<mxGeometry x="84" y="230" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-25" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;dashed=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-8" target="qu7T5ML7RWsz-wKexjIw-17">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-8" value="Track Participants" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#ffe6cc;strokeColor=#d79b00;" vertex="1" parent="1">
<mxGeometry x="376" y="230" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-14" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=1;exitDx=0;exitDy=0;entryX=0;entryY=1;entryDx=0;entryDy=0;curved=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-7" target="qu7T5ML7RWsz-wKexjIw-7">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="204" y="350" />
<mxPoint x="84" y="350" />
</Array>
</mxGeometry>
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-15" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=1;exitDx=0;exitDy=0;entryX=0;entryY=1;entryDx=0;entryDy=0;curved=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-8" target="qu7T5ML7RWsz-wKexjIw-8">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="496" y="350" />
<mxPoint x="376" y="350" />
</Array>
</mxGeometry>
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-17" value="Entries on Clinical Trials.gov" style="shape=process;whiteSpace=wrap;html=1;backgroundOutline=1;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="1">
<mxGeometry x="376" y="383" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-27" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-18" target="qu7T5ML7RWsz-wKexjIw-26">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-18" value="Close Trial" style="whiteSpace=wrap;html=1;shape=mxgraph.basic.octagon2;align=center;verticalAlign=middle;dx=15;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="1">
<mxGeometry x="538" y="210" width="100" height="100" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-21" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-19" target="qu7T5ML7RWsz-wKexjIw-8">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-19" value="Close Enrollment" style="shape=hexagon;perimeter=hexagonPerimeter2;whiteSpace=wrap;html=1;fixedSize=1;" vertex="1" parent="1">
<mxGeometry x="226" y="230" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-22" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;entryPerimeter=0;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-8" target="qu7T5ML7RWsz-wKexjIw-18">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-29" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=1;entryY=0.5;entryDx=0;entryDy=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;dashed=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-26" target="qu7T5ML7RWsz-wKexjIw-17">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-26" value="Final Reports" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#e1d5e7;strokeColor=#9673a6;" vertex="1" parent="1">
<mxGeometry x="676" y="230" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-37" value="&lt;b style=&quot;font-size: 16px;&quot;&gt;Simplified Clinical Trial Timeline&lt;br&gt;&lt;/b&gt;" style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;" vertex="1" parent="1">
<mxGeometry x="224" y="120" width="424" height="30" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-39" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;entryPerimeter=0;exitX=0.5;exitY=0;exitDx=0;exitDy=0;dashed=1;dashPattern=1 2;curved=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-19" target="qu7T5ML7RWsz-wKexjIw-18">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="286" y="170" />
<mxPoint x="588" y="170" />
</Array>
</mxGeometry>
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-40" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-42" target="qu7T5ML7RWsz-wKexjIw-51">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-41" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;dashed=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-42" target="qu7T5ML7RWsz-wKexjIw-47">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-62" style="rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0.75;exitY=1;exitDx=0;exitDy=0;entryX=0;entryY=0;entryDx=0;entryDy=0;startArrow=classic;startFill=1;dashed=1;dashPattern=1 2;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-42" target="qu7T5ML7RWsz-wKexjIw-60">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-42" value="Enroll / Track Participants" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#fff2cc;strokeColor=#d6b656;" vertex="1" parent="1">
<mxGeometry x="80" y="620" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-43" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;dashed=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-44" target="qu7T5ML7RWsz-wKexjIw-47">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-61" style="rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=1;entryY=0;entryDx=0;entryDy=0;endArrow=classic;endFill=1;startArrow=classic;startFill=1;dashed=1;dashPattern=1 2;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-44" target="qu7T5ML7RWsz-wKexjIw-60">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-44" value="Track Participants" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#ffe6cc;strokeColor=#d79b00;" vertex="1" parent="1">
<mxGeometry x="372" y="620" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-45" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=1;exitDx=0;exitDy=0;entryX=0;entryY=1;entryDx=0;entryDy=0;curved=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-42" target="qu7T5ML7RWsz-wKexjIw-42">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="200" y="740" />
<mxPoint x="80" y="740" />
</Array>
</mxGeometry>
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-46" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=1;exitDx=0;exitDy=0;entryX=0;entryY=1;entryDx=0;entryDy=0;curved=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-44" target="qu7T5ML7RWsz-wKexjIw-44">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="492" y="740" />
<mxPoint x="372" y="740" />
</Array>
</mxGeometry>
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-47" value="Entries on Clinical Trials.gov" style="shape=process;whiteSpace=wrap;html=1;backgroundOutline=1;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="1">
<mxGeometry x="372" y="830" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-48" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-49" target="qu7T5ML7RWsz-wKexjIw-54">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-49" value="Close Trial" style="whiteSpace=wrap;html=1;shape=mxgraph.basic.octagon2;align=center;verticalAlign=middle;dx=15;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="1">
<mxGeometry x="534" y="600" width="100" height="100" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-50" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-51" target="qu7T5ML7RWsz-wKexjIw-44">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-51" value="Close Enrollment" style="shape=hexagon;perimeter=hexagonPerimeter2;whiteSpace=wrap;html=1;fixedSize=1;" vertex="1" parent="1">
<mxGeometry x="222" y="620" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-52" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;entryPerimeter=0;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-44" target="qu7T5ML7RWsz-wKexjIw-49">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-53" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=1;entryY=0.5;entryDx=0;entryDy=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;dashed=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-54" target="qu7T5ML7RWsz-wKexjIw-47">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-54" value="Final Reports" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#e1d5e7;strokeColor=#9673a6;" vertex="1" parent="1">
<mxGeometry x="672" y="620" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-55" value="&lt;b style=&quot;font-size: 16px;&quot;&gt;Clinical Trial Timeline&lt;br&gt;&lt;/b&gt;" style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;" vertex="1" parent="1">
<mxGeometry x="304" y="510" width="230" height="30" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-56" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;entryPerimeter=0;exitX=0.5;exitY=0;exitDx=0;exitDy=0;dashed=1;dashPattern=1 2;curved=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-51" target="qu7T5ML7RWsz-wKexjIw-49">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="282" y="560" />
<mxPoint x="584" y="560" />
</Array>
</mxGeometry>
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-58" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="qu7T5ML7RWsz-wKexjIw-57" target="qu7T5ML7RWsz-wKexjIw-42">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-57" value="Pre-Enrollment" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#fff2cc;strokeColor=#d6b656;" vertex="1" parent="1">
<mxGeometry x="80" y="520" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="qu7T5ML7RWsz-wKexjIw-60" value="Suspend" style="shape=rhombus;perimeter=rhombusPerimeter;whiteSpace=wrap;html=1;align=center;" vertex="1" parent="1">
<mxGeometry x="224" y="737" width="120" height="60" as="geometry" />
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>

Binary file not shown. Before: 156 KiB
Binary file not shown. Before: 164 KiB
Binary file not shown. Before: 26 KiB
Binary file not shown. Before: 27 KiB
Binary file not shown. Before: 162 KiB
Binary file not shown. Before: 150 KiB
Binary file not shown. Before: 164 KiB
Binary file not shown. Before: 31 KiB
Binary file not shown. Before: 21 KiB
Binary file not shown. Before: 64 KiB
Binary file not shown. Before: 35 KiB
Binary file not shown. Before: 22 KiB
Binary file not shown. Before: 65 KiB
Binary file not shown. Before: 19 KiB
Binary file not shown. Before: 17 KiB
Binary file not shown. Before: 24 KiB
Binary file not shown. After: 179 KiB
Binary file not shown. After: 57 KiB
Binary file not shown. After: 357 KiB
Binary file not shown. After: 263 KiB
Binary file not shown. After: 270 KiB
Binary file not shown. After: 330 KiB
Binary file not shown. After: 394 KiB
Binary file not shown. After: 343 KiB
Binary file not shown. After: 364 KiB
Binary file not shown. After: 261 KiB
Binary file not shown. After: 78 KiB

Some files were not shown because too many files have changed in this diff.
