
\documentclass[../Main.tex]{subfiles}

\begin{document}

In this section
I describe the model fitting, present the posteriors of the parameters of interest,
and interpret the results.


\subsection{Data Summaries and Estimation Procedure}

% Data Summaries
Overall, I successfully processed 162 trials, with 1,347 snapshots between them.
Figure \ref{fig:snapshot_counts} shows the histogram of snapshots per trial.
Most trials lasted less than 1,500 days, as can be seen in
Figure \ref{fig:trial_durations}.
Although a large number of snapshots is used to fit the
model, the number of trials -- the unit of observation -- is quite low.
Moreover, these trials are spread over multiple ICD-10 categories,
so the number of trials in any single category is smaller still.

A scatterplot gives a rough idea of the observed
relationship between the number of snapshots and the duration of trials.
Figure \ref{fig:snapshot_duration_scatter} shows this relationship;
the two are positively correlated (measured at $0.34$).

\begin{figure}[H]
    \includegraphics[width=\textwidth]{../assets/img/trials_details/HistSnapshots}
    \caption{Histogram of the count of Snapshots}
    \label{fig:snapshot_counts}
\end{figure}

\begin{figure}[H]
    \includegraphics[width=\textwidth]{../assets/img/trials_details/HistTrialDurations_Faceted}
    \caption{Histograms of Trial Durations}
    \label{fig:trial_durations}
\end{figure}

\begin{figure}[H]
    \includegraphics[width=\textwidth]{../assets/img/trials_details/SnapshotsVsDurationVsTermination}
    \caption{Scatterplot comparing the Count of Snapshots and Trial Duration}
    \label{fig:snapshot_duration_scatter}
\end{figure}

% Estimation Procedure
I fit the econometric model using Stan
\cite{standevelopmentteam_stanmodellingusersguide_2022}
through the rstan
\cite{standevelopmentteam_rstaninterfacestan_2023}
interface, using 4 chains with
%describe
2,500
warmup iterations and
2,500
sampling iterations each.

Two of the chains exhibited a low
Estimated Bayesian Fraction of Missing Information (E-BFMI),
suggesting that there are some parts of the posterior distribution
that were not explored well during the model fitting
\cite{standevelopmentteam_runtimewarningsconvergence_2022}.
I presume this is due to the low number of trials in some of the
ICD-10 categories.
We can see in Figure \ref{FIG:barchart_idc_categories} that some of these
disease categories had a single trial represented while others were
not represented at all.

\begin{figure}[H]
    \includegraphics[width=\textwidth]{../assets/img/trials_details/CategoryCounts}
    \caption{Bar chart of trials by ICD-10 categories}
    \label{FIG:barchart_idc_categories}
\end{figure}
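
For reference, the E-BFMI diagnostic is commonly computed per chain from the
sampler's energy values.
A sketch of the usual estimator, assuming $E_n$ denotes the energy at sampling
iteration $n$ of a chain with $N$ iterations and $\bar{E}$ denotes its mean
(notation introduced here only for illustration), is
\[
    \mbox{E-BFMI} \approx
    \frac{\sum_{n=2}^{N} \left( E_n - E_{n-1} \right)^2}
         {\sum_{n=1}^{N} \left( E_n - \bar{E} \right)^2}.
\]
A low value means that the energy changes little between successive iterations
relative to its overall variation, which is consistent with the sampler
moving slowly through parts of the posterior.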


\subsection{Primary Results}

The primary, causally identified quantity we can estimate is the change in
the probability of termination caused by (counterfactually) keeping enrollment
open instead of closing enrollment when observed.
Figure \ref{fig:pred_dist_diff_delay} below shows this impact of
keeping enrollment open.
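
As a sketch of this quantity, in notation introduced here only for illustration
(the symbols $\delta_{p,i}$, $p_i^{\mathrm{open}}$, and $p_i^{\mathrm{closed}}$
are assumptions rather than part of the model definition),
the difference for trial $i$ can be written as
\[
    \delta_{p,i} = p_i^{\mathrm{open}} - p_i^{\mathrm{closed}},
    \qquad -1 \le \delta_{p,i} \le 1,
\]
where $p_i^{\mathrm{open}}$ is the predicted probability of termination when
enrollment is counterfactually kept open and $p_i^{\mathrm{closed}}$ is the
predicted probability of termination when enrollment closes as observed.
A positive value therefore means that keeping enrollment open raises the
predicted probability of termination.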

% \begin{minipage}{\textwidth}
\begin{figure}[H]
    \includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_boxplot}
    \small{
        Values near 1 indicate a near-perfect increase in the probability
        of termination.
        Values near 0 indicate little change in probability,
        while values near $-1$ represent a decrease in the probability
        of termination.
        The scale is in probability points; thus a value near 1 is a change
        from unlikely to terminate under control to highly likely to
        terminate.
    }
    \caption{Boxplot of the Distribution of Predicted Differences}
    \label{fig:pred_dist_diff_delay}
\end{figure}

\begin{table}[H]
    \centering
    \caption{Boxplot Summary Statistics: percentage-point change in the probability of termination due to the intervention}
    \label{table:boxplotsummary}
    \begin{tabular}{ | c c c c c c c c | }
        \hline
        5th & 10th & 25th & median &
        75th & 90th & 95th & mean \\
        \hline
        -2.1 & -0.8 & 0.0 & 1.2 &
         4.2 &  8.2 & 11.0  & 2.5 \\
        \hline
    \end{tabular}
\end{table}

% \end{minipage}

The key figures from the boxplot in Figure
\ref{fig:pred_dist_diff_delay}
are summarized in Table \ref{table:boxplotsummary}.
There are a few interesting things to point out here.
First, over 75\% of the probability mass is equal to or above zero, suggesting
that most trials will experience some harm from a delay in closing enrollment.
Second, about 39.1\% of the probability mass is contained within the interval
$[-0.01, 0.01]$, that is, within one percentage point of zero.
The full table of percentiles, in 5\% increments, can be found in Table
\ref{TABLE:PercentilesOfDistributionOfDifferences}
\todo{fix table}
in Appendix
\ref{Appendix:Results}.


Figure \ref{fig:pred_dist_dif_delay2} shows that the different disease categories
tend to have similar results:
\begin{figure}[H]
    \includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_by_group}
    \caption{Distribution of Predicted differences by Disease Group}
    \label{fig:pred_dist_dif_delay2}
\end{figure}

% Continuing to the $\beta$ parameters in figure
% \ref{fig:parameters_ANR_by_group},
% we can see the estimated distributions
% the status: \textbf{Recruiting}.
% %TOFIX: Discuss how this is a fixed effect with no comparator,i.e. compared
% %to the "average" conditions it is an "increase/decrease" in the probability of termination.
% This xxx in the probability of termination is strongest in the categories of Neoplasms ($n=49$),
% Musculoskeletal diseases ($n=17$), and Infections and Parasites ($n=20$), the three categories with the most data.
% % As this is a comparison against the trial status XXX, we note that YYY.
% % \todo{The natural comparison I want to make is against the Recruting status. Do I want to redo this so that I can read that directly?It shouldn't affect the $\delta_p$ analysis, but this could probably use it. YES, THIS UPDATE NEEDS TO HAPPEN. The base needs to be ``active not recruiting.''}
% Overall, this is consistent with the result that extending a clinical trial's enrollment period will reduce the probability of termination.
%
% \begin{figure}[H]
%     \includegraphics[width=\textwidth]{../assets/img/betas/parameter_across_groups/parameters_12_status_ANR}
%     \caption{Distribution of parameters associated with ``Active, not recruiting'' status, by ICD-10 Category}
%     \label{fig:parameters_ANR_by_group}
% \end{figure}
% % -


%   - Potential Explanations for high impact regime:
% This leads to the question:
% ``How could this intervention have such a wide range in the intensity
% and direction of impacts?''
% The most likely explanations in my mind are that either
% some trials are highly suceptable to enrollment struggles or that this is a
% modelling artifact.
% %       - Some trials are highly suceptable. This is the face value effect
% The first option -- that some trials are more suceptable to
% issues with participant enrollment -- should allow us to
% isolate categories or trials that contribute the most to this effect.
% This is not what we find when we inspect the categories
% in figure
% \ref{fig:pred_dist_dif_delay2}.
% Instead it appears that most of the categories have this high
% impact regime when $\delta_p > 0.75$, although the maximum value
% of this regime varies considerably.
%
% Another explanation is that this is a modelling artefact due to priors
% with strong tails and the relatively low number of trials in
% each ICD-10 categories.
% In short, there might be high levels of uncertanty in some parameter values,
% which manifest as fat tails in the distributions of the $\beta$ parameters.
% Because of the logistic format of the model, these fat tails lead to
% extreme values of $p$, and potentally large changes $\delta_p$.
% I believe that this second explanation -- a model artifact due to uncertanty --
% is likely to be the cause.
% A few things lead me to believe this:
% \begin{itemize}
%     \item The low fractions of E-BFMI suggest that the sampler is struggling
%         to explore some regions of the posterior.
%         According to
%         \cite{standevelopmentteam_runtimewarningsconvergence_2022}
%         this is
%         often due to thick tails of posterior distributions.
%         During earlier analysis, when I had about 100 trials, the number of
%         warnings was significantly higher.
%     \item When we examine the results across different ICD-10 category,
%         \ref{fig:pred_dist_dif_delay2}
%         we note that most categories have the same upper tail spike.
%     \item In Figure
%         % \ref{fig:betas_delay},
%         \ref{fig:parameters_ANR_by_group},
%         we see that most ICD-10 categories
%         have fat tails in the $\beta$s, even among the categories
%         relatively larger sample sizes.
% \end{itemize}
%
% Overally it is hard to escape the conclusion that more data is needed across
% many -- if not all -- of the disease categories.
% At the same time, the median result is a decrease in the probability
% of termination when the enrollment period is held open.
% My inclination is to believe that the overall effect is to reduce the
% probability of termination.


\end{document}