\documentclass[../Main.tex]{subfiles}
\begin{document}
In this section
I describe the model fitting, present the posteriors of the parameters of interest,
and interpret the results.
\subsection{Data Summaries and Estimation Procedure}
% Data Summaries
Overall, I successfully processed 162 trials, with 1,347 snapshots between them.
Figure \ref{fig:snapshot_counts} shows the histogram of snapshots per trial.
Most trials lasted less than 1,500 days, as can be seen in
Figure \ref{fig:trial_durations}.
Although a large number of snapshots will be used to fit the
model, the number of trials -- the unit of observation -- is quite low.
Moreover, these trials are spread over multiple ICD-10 categories,
so the number of trials within each category is lower still.
A scatterplot gives a rough idea of the observed
relationship between the number of snapshots and the duration of trials.
Figure \ref{fig:snapshot_duration_scatter} shows this relationship, where
the positive correlation (measured at $0.34$) is apparent.
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/trials_details/HistSnapshots}
\caption{Histogram of the count of Snapshots}
\label{fig:snapshot_counts}
\end{figure}
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/trials_details/HistTrialDurations_Faceted}
\caption{Histograms of Trial Durations}
\label{fig:trial_durations}
\end{figure}
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/trials_details/SnapshotsVsDurationVsTermination}
\caption{Scatterplot comparing the Count of Snapshots and Trial Duration}
\label{fig:snapshot_duration_scatter}
\end{figure}
% Estimation Procedure
I fit the econometric model using Stan
\cite{standevelopmentteam_stanmodellingusersguide_2022}
through the rstan
\cite{standevelopmentteam_rstaninterfacestan_2023}
interface using 4 chains with
%describe
2,500
warmup iterations and
2,500
sampling iterations each.
Two of the chains experienced a low
Estimated Bayesian Fraction of Missing Information (E-BFMI),
suggesting that there are some parts of the posterior distribution
that were not explored well during the model fitting
\cite{standevelopmentteam_runtimewarningsconvergence_2022}.
I presume this is due to the low number of trials in some of the
ICD-10 categories.
We can see in Figure \ref{FIG:barchart_idc_categories} that some of these
disease categories had a single trial represented while others were
not represented at all.
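For reference, the E-BFMI diagnostic mentioned above can be written, for a
single chain with $N$ sampling iterations, as
\[
  \widehat{\mathrm{BFMI}} =
  \frac{\sum_{n=1}^{N} \left( E_n - E_{n-1} \right)^2}
       {\sum_{n=0}^{N} \left( E_n - \bar{E} \right)^2},
\]
where $E_n$ is the Hamiltonian energy at iteration $n$ and $\bar{E}$ is its
mean over the chain.
This is the usual estimator as I understand it, stated here as background
rather than as a quantity computed separately for this paper.
The numerator measures how much the energy changes between successive
iterations, and the denominator measures the total variation in energy, so a
small ratio indicates that the sampler moves slowly across energy levels.
Commonly used warning thresholds are roughly in the range $0.2$ to $0.3$.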
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/trials_details/CategoryCounts}
\caption{Bar chart of trials by ICD-10 categories}
\label{FIG:barchart_idc_categories}
\end{figure}
\subsection{Primary Results}
The primary, causally identified quantity we can estimate is the change in
the probability of termination caused by (counterfactually) keeping enrollment
open instead of closing enrollment when it was observed to close.
Figure \ref{fig:pred_dist_diff_delay} below shows this impact of
keeping enrollment open.
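As a sketch of the notation (the superscripts are labels assumed here for
exposition, not symbols taken from the model definition), this quantity can be
written for a given trial as
\[
  \delta_p = p^{\mathrm{open}} - p^{\mathrm{closed}},
\]
where $p^{\mathrm{closed}}$ is the predicted probability of termination with
enrollment closed as observed (the control) and $p^{\mathrm{open}}$ is the
predicted probability for the same trial with enrollment counterfactually kept
open.
Because both terms are probabilities, $\delta_p \in [-1, 1]$, and a positive
value means that keeping enrollment open raises the probability of termination.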
% \begin{minipage}{\textwidth}
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_boxplot}
\small{
Values near 1 indicate a near-total increase in the probability
of termination.
Values near 0 indicate little change in probability,
while values near $-1$ represent a decrease in the probability
of termination.
The scale is in probability units, so a value near 1 is a change
from unlikely to terminate under control to highly likely to
terminate.
}
\caption{Histogram of the Distribution of Predicted Differences}
\label{fig:pred_dist_diff_delay}
\end{figure}
\begin{table}[H]
\centering
\caption{Boxplot Summary Statistics}
\label{table:boxplotsummary}
\begin{tabular}{ | c c c c c c c c | }
\hline
5th & 10th & 25th & median &
75th & 90th & 95th & mean \\
\hline
$-0.376$ & $-0.264$ & $-0.129$ & $-0.023$ &
0.145 & 0.925 & 0.982 & 0.096 \\
\hline
\end{tabular}
\end{table}
% \end{minipage}
The key figures from the boxplot in Figure
\ref{fig:pred_dist_diff_delay}
are summarized in Table \ref{table:boxplotsummary}.
Before interpreting the results, it is worth getting acquainted with the
details of this distribution.
First, 63\% of the probability mass is at or below zero.
Second, about 13\% of the probability mass is contained within the interval
$[-0.01, 0.01]$.
The full table of percentiles, in 5\% increments, can be found in Table
\ref{TABLE:PercentilesOfDistributionOfDifferences}
in Appendix
\ref{Appendix:Results}.
The distribution can also be divided into three regimes.
% - spike at 0
% - the boxplot
% - 63% of mass below 0 : find better way to say that
% - For a random trial, there is a 63% chance that the impact is to reduce the probability of a termination.
% - 2 pctg-point wide band centered on 0 has ~13% of the masss
% - mean represents 9.x% increase in probability of termination. A quick simulation gives about the same pctg-point increase in terminated trials.
% - there are 3 regimes: low impact (near zero), medium impact (concentrated in decreased probability of termination), and high impact (concentrated in increased probability of termination).
The first regime consists of the low-impact results, i.e.\ those values of $\delta_p$
near zero.
About 13\% of trials lie within a single percentage point of zero,
suggesting that there is a reasonable chance that delaying
the close of enrollment has no impact.
The second regime consists of moderate impacts on clinical trials'
probabilities of termination, say values in the interval $[-0.5, 0.5]$
on the graph.
Most of this probability mass represents a decrease in the probability of
termination, some of it a rather large decrease.
The third regime consists of the high-impact region,
almost exclusively concentrated in large increases in the probability of
termination, $\delta_p > 0.75$.
These represent cases where delaying the close of enrollment changes a trial
from one that was highly likely to complete its primary objectives to
one that was likely or almost certain to terminate early.
% - the high impact regime is strange because it consists of trials that moved from unlikely (<20% chance) of termination to a high chance (>80% chance) of termination. Something like 5% of all trials have a greater than 98 percentage point increase in termination. Not sure what this is doing.
% Looking at the spike around zero, we find that 13.09% of the probability mass
% is contained within the band from [-1,1].
% Additionally, there was 33.4282738% of the probability above that
% representing those with a general increase in the
% probability of termination and 53.4817262% of the probability mass
% below the band representing a decrease in the probability of termination.
% On average, if you keep the trial open instead of closing it, 0.6337363% of
% trials will see a decrease in the probability of termination, but, due to
% the high increase in probability of termination given termination was
% increased, the mean probability of termination increases by 0.0964726.
% Pulled the data from the report
% ```{r}
% summary(pddf_ib$value)
% Min. 1st Qu. Median Mean 3rd Qu. Max.
% -0.99850 -0.12919 -0.02259 0.09647 0.14531 1.00000
% quants <- quantile(pddf_ib$value, probs = seq(0,1,0.05), type=4)
% # Convert to a data frame
% quant_df <- data.frame( Percentile = names(quants), Value = quants )
% kable(quant_df)
% Percentile Value
% SEE TABLE IN APPENDIX
%```
Figure \ref{fig:pred_dist_dif_delay2} shows that the different disease categories
tend to have similar results:
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_by_group}
\caption{Distribution of Predicted differences by Disease Group}
\label{fig:pred_dist_dif_delay2}
\end{figure}
Again, note the high mass near zero, the general decrease in the probability
of termination, and then the strong upper tails.
Continuing to the $\beta$ parameters in Figure
\ref{fig:parameters_ANR_by_group},
we can see the estimated distributions for
the status \textbf{Active, not recruiting}.
The prior distributions were centered on zero, but we can see that the
pooled learning has moved the mean
values negative, representing reductions in the probability of termination
across the board.
This decrease in the probability of termination is strongest in the categories of Neoplasms ($n=49$),
Musculoskeletal diseases ($n=17$), and Infections and Parasites ($n=20$), the three categories with the most data.
As this is a comparison against the trial status XXX, we note that YYY.
\todo{The natural comparison I want to make is against the Recruiting status. Do I want to redo this so that I can read that directly? It shouldn't affect the $\delta_p$ analysis, but this could probably use it. YES, THIS UPDATE NEEDS TO HAPPEN. The base needs to be ``active not recruiting.''}
Overall, this is consistent with the result that extending a clinical trial's enrollment period will reduce the probability of termination.
\begin{figure}[H]
\includegraphics[width=\textwidth]{../assets/img/betas/parameter_across_groups/parameters_12_status_ANR}
\caption{Distribution of parameters associated with ``Active, not recruiting'' status, by ICD-10 Category}
\label{fig:parameters_ANR_by_group}
\end{figure}
% -
% - Potential Explanations for high impact regime:
This leads to the question:
``How could this intervention have such a wide range in the intensity
and direction of impacts?''
The most likely explanations in my mind are that either
some trials are highly susceptible to enrollment struggles or that this is a
modelling artifact.
% - Some trials are highly suceptable. This is the face value effect
The first option -- that some trials are more susceptible to
issues with participant enrollment -- should allow us to
isolate categories or trials that contribute the most to this effect.
This is not what we find when we inspect the categories
in Figure
\ref{fig:pred_dist_dif_delay2}.
Instead, it appears that most of the categories have this high-impact
regime where $\delta_p > 0.75$, although the maximum value
of this regime varies considerably.
Another explanation is that this is a modelling artifact due to priors
with heavy tails and the relatively low number of trials in
each ICD-10 category.
In short, there might be high levels of uncertainty in some parameter values,
which manifest as fat tails in the distributions of the $\beta$ parameters.
Because of the logistic form of the model, these fat tails lead to
extreme values of $p$, and potentially large changes $\delta_p$.
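To illustrate the mechanism with a worked example (the linear predictor
$\eta$ and the values used below are assumptions chosen for illustration, not
estimates from the fitted model), recall that the inverse logit maps $\eta$ to
a probability:
\[
  p = \frac{1}{1 + e^{-\eta}}, \qquad
  \eta = -4 \Rightarrow p \approx 0.018, \qquad
  \eta = 4 \Rightarrow p \approx 0.982 .
\]
A posterior draw in which a fat-tailed $\beta$ shifts $\eta$ from $-4$ to $4$
would therefore produce $\delta_p \approx 0.982 - 0.018 = 0.964$, whereas a
shift from $-0.5$ to $0.5$ gives only $\delta_p \approx 0.622 - 0.378 = 0.244$.
Occasional extreme draws of $\beta$ are thus enough to generate a cluster of
values of $\delta_p$ near 1.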
I believe that this second explanation -- a model artifact due to uncertainty --
is likely to be the cause.
A few things lead me to believe this:
\begin{itemize}
\item The low E-BFMI values suggest that the sampler is struggling
to explore some regions of the posterior.
According to
\cite{standevelopmentteam_runtimewarningsconvergence_2022},
this is
often due to thick tails of posterior distributions.
During earlier analysis, when I had about 100 trials, the number of
warnings was significantly higher.
\item When we examine the results across different ICD-10 categories
in Figure
\ref{fig:pred_dist_dif_delay2},
we note that most categories have the same upper tail spike.
\item In Figure
% \ref{fig:betas_delay},
\ref{fig:parameters_ANR_by_group},
we see that most ICD-10 categories
have fat tails in the $\beta$s, even among the categories
with relatively larger sample sizes.
\end{itemize}
Overall, it is hard to escape the conclusion that more data is needed across
many -- if not all -- of the disease categories.
At the same time, the median result is a decrease in the probability
of termination when the enrollment period is held open.
My inclination is to believe that the overall effect is to reduce the
probability of termination.
\end{document}