Squashed commit of the following:
commit 963293fc2b
Author: Will King <will.king.git@youainti.com>
Date: Mon Jan 13 21:15:44 2025 -0800

    Added diagnostics appendix, notes in results

commit d6d2360206
Author: Will King <will.king.git@youainti.com>
Date: Mon Jan 13 20:53:18 2025 -0800

    Finally got all the needed images correct. Adjusted directory to make it easier to find images.

commit 37d35377b3
Author: Will King <will.king.git@youainti.com>
Date: Mon Jan 13 16:29:00 2025 -0800

    Added more images to assets & included in results.

commit becefe15e0
Author: Will King <will.king.git@youainti.com>
Date: Mon Jan 13 12:56:15 2025 -0800

    Updated images

commit 86f9b8dfc9
Author: will king <youainti@protonmail.com>
Date: Mon Jan 13 09:24:20 2025 -0800

    finished drafting results

commit 64f3d14f7b
Author: will king <youainti@protonmail.com>
Date: Mon Jan 6 12:48:48 2025 -0800

    Midday updates from writing

commit 1630af2928
Author: will king <youainti@protonmail.com>
Date: Tue Dec 3 17:08:43 2024 -0800

    more updates

commit 5d9640ab8d
Author: will king <youainti@protonmail.com>
Date: Thu Nov 28 23:39:04 2024 -0800

    saving work

commit 3e6a8f10d4
Author: will king <youainti@protonmail.com>
Date: Tue Nov 26 17:18:24 2024 -0800

    tweaked econometrics presentation, added todos

commit 7d51cb10b3
Author: will king <youainti@protonmail.com>
Date: Tue Nov 26 15:57:50 2024 -0800

    updated layout, added gitignore
@ -0,0 +1,5 @@
|
||||
*.pdf
|
||||
*.aux
|
||||
*.lof
|
||||
*.lot
|
||||
*.idx
|
||||
@ -1,119 +1,206 @@
|
||||
\documentclass[../Main.tex]{subfiles}
|
||||
|
||||
\begin{document}
|
||||
%\subsection{Data Exploration} %TODO: fill this out later.
|
||||
%look at trial
|
||||
\subsection{Model Fitting}
|
||||
In this section we examine the results from fitting the econometric model using
|
||||
mc-stan (\cite{mc-stan}) through the rstan (\cite{rstan}) interface.
|
||||
|
||||
%describe
|
||||
The model was based on the hierarchical logistic regression model
presented in the Stan User's Guide (\cite{mc-stan}),
|
||||
and was run with 2,500 warmup iterations and
|
||||
2,500 sampling iterations in six chains.
|
||||
The fit showed several problems, including 160 divergent transitions and
an R-hat of 1.49.
|
||||
Overall these suggest that the econometric model is incorrect as
|
||||
written or requires reparameterization.
|
||||
%TODO: and info about how I learned about these diagnostics
|
||||
|
||||
|
||||
% \subsubsection{Diagnostics}
|
||||
% %Examine trank plots
|
||||
% To identify which parameters were problematic, I first looked at trace rank
|
||||
% histograms.
|
||||
% Under ideal circumstances, each line (representing a chain) should exchange
|
||||
% places with the other lines frequently.
|
||||
% In both \cref{fig:mu_trank} and \cref{fig:sigma_trank}, most parameters seem
|
||||
% to mix well but there are a couple of exceptions.
|
||||
% This warrants further investigation.
|
||||
%
|
||||
% \begin{figure}[H]
|
||||
% \includegraphics[width=\textwidth]{../assets/img/mu_trank.png}
|
||||
% \caption{Trace Rank Histogram: Mu values}
|
||||
% \label{fig:mu_trank}
|
||||
% \end{figure}
|
||||
%
|
||||
% \begin{figure}[H]
|
||||
% \includegraphics[width=\textwidth]{../assets/img/sigma_trank.png}
|
||||
% \caption{Trace Rank Histogram: Sigma values}
|
||||
% \label{fig:sigma_trank}
|
||||
% \end{figure}
|
||||
%
|
||||
% %Take a look at batman and points for mu
|
||||
% In the case of the Mu values, a parallel coordinates plot
|
||||
% doesn't seem to indicate any parameters as likely candidates
|
||||
% for causing the issues with divergent transitions.
|
||||
% \begin{figure}[H]
|
||||
% \includegraphics[width=\textwidth]{../assets/img/mu_batman.png}
|
||||
% \caption{Parallel Coordinate Plot: Mu values}
|
||||
% \label{fig:mu_batman}
|
||||
% \end{figure}
|
||||
% Note that at each parameter, there is some level of dispersion between
|
||||
% values that diverged.
|
||||
%
|
||||
% On the other hand, in the parallel coordinates plot for sigma values,
|
||||
% it appears that most divergent transitions occur with values of
|
||||
% sigma[1], sigma[3], sigma[6], and sigma[7] close to zero.
|
||||
% \begin{figure}[H]
|
||||
% \includegraphics[width=\textwidth]{../assets/img/sigma_batman.png}
|
||||
% \caption{Parallel Coordinate Plot: Sigma values}
|
||||
% \label{fig:sigma_batman}
|
||||
% \end{figure}
|
||||
% Overall this suggests that there is an issue with the specification
|
||||
% of the covariance structures of the hyperparameters.
|
||||
%
|
||||
% Additional evidence that the covariance structure is incorrect comes from
|
||||
% plotting pairs of parameter values and examining the chains with divergent
|
||||
% transitions.
|
||||
%
|
||||
% \begin{figure}[H]
|
||||
% \includegraphics[width=\textwidth]{../assets/img/sigma_pairs_5-9.png}
|
||||
% \caption{Parameter Pairs plots: Sigma[5] through Sigma[9]}
|
||||
% \label{fig:sigma_pairs_5-9.png}
|
||||
% \end{figure}
|
||||
% From this we can see that divergent pairs are highly correlated with the cases
|
||||
% where sigma[6] or sigma[7] are equal to zero.
|
||||
% This has an impact on the shape of both of those estimated parameters, causing
|
||||
% both to be bimodal.
|
||||
|
||||
|
||||
\subsection{Interpretation}
|
||||
|
||||
The key results so far are related to the distribution of differences in $p$.
|
||||
|
||||
In Figure \ref{fig:pred_dist_diff_delay} we see that, while most trials do not see any increased risk
|
||||
from a delay in closing enrollment, there is a small group that does experience this.
|
||||
In this section
|
||||
I describe the model fitting, the posteriors of the parameters of interest,
|
||||
and interpret the results.
|
||||
|
||||
|
||||
\subsection{Data Summaries and Estimation Procedure}
|
||||
|
||||
% Data Summaries
|
||||
Overall, I successfully processed 162 trials, with 1,347 snapshots between them.
|
||||
Figure \ref{fig:snapshot_counts} shows the histogram of snapshots per trial.
|
||||
Most trials lasted less than 1,500 days, as can be seen in
|
||||
Figure \ref{fig:trial_durations}.
|
||||
Although there are a large number of snapshots that will be used to fit the
|
||||
model, the number of trials -- the unit of observation -- is quite low.
Moreover, these trials are spread over multiple ICD-10 categories,
so the number of trials within any one category is lower still.
|
||||
|
||||
To continue, we can use a scatterplot to get a rough idea of the observed
|
||||
relationship between the number of snapshots and the duration of trials.
|
||||
We can see this in Figure \ref{fig:snapshot_duration_scatter}, where
|
||||
the correlation (measured at $0.34$) is apparent.
|
||||
|
||||
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
|
||||
\caption{}
|
||||
\label{fig:pred_dist_diff_delay}
|
||||
\includegraphics[width=\textwidth]{../assets/img/trials_details/HistTrialDurations_Faceted}
|
||||
\todo{Replace this graphic with the histogram of trial durations}
|
||||
\caption{Histograms of Trial Durations}
|
||||
\label{fig:trial_durations}
|
||||
\end{figure}
|
||||
|
||||
Figure \ref{fig:pred_dist_dif_delay2} shows how this varies across disease categories.
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay-group}
|
||||
\caption{}
|
||||
\label{fig:pred_dist_dif_delay2}
|
||||
\includegraphics[width=\textwidth]{../assets/img/trials_details/HistSnapshots}
|
||||
\todo{Replace this graphic with the histogram of snapshots}
|
||||
\caption{Histogram of the count of Snapshots}
|
||||
\label{fig:snapshot_counts}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/trials_details/SnapshotsVsDurationVsTermination}
|
||||
\todo{Replace this graphic with the scatterplot comparing durations and snapshots}
|
||||
\caption{Scatterplot comparing the Count of Snapshots and Trial Duration}
|
||||
    \label{fig:snapshot_duration_scatter}
|
||||
\end{figure}
|
||||
|
||||
% Estimation Procedure
|
||||
I fit the econometric model using mc-stan
|
||||
\cite{standevelopmentteam_StanModelling_2022}
|
||||
through the rstan
|
||||
\cite{standevelopmentteam_RStanInterface_2023}
|
||||
interface using 4 chains with
|
||||
%describe
|
||||
2,500
|
||||
warmup iterations and
|
||||
2,500
|
||||
sampling iterations each.
|
||||
|
||||
Two of the chains experienced a low
|
||||
Estimated Bayesian Fraction of Missing Information (E-BFMI),
|
||||
suggesting that there are some parts of the posterior distribution
|
||||
that were not explored well during the model fitting.
|
||||
I presume this is due to the low number of trials in some of the
|
||||
ICD-10 categories.
|
||||
We can see in Figure \ref{fig:barchart_idc_categories} that some of these
|
||||
disease categories had a single trial represented while others were
|
||||
not represented at all.
|
||||
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/trials_details/CategoryCounts}
|
||||
\caption{Bar chart of trials by ICD-10 categories}
|
||||
\label{fig:barchart_idc_categories}
|
||||
\end{figure}
|
||||
|
||||
We can also examine the direct effect of adding a single generic competitor drug.
|
||||
|
||||
\subsection{Primary Results}
|
||||
|
||||
The primary, causally-identified value we can estimate is the change in
|
||||
the probability of termination caused by (counterfactually) keeping enrollment
|
||||
open instead of closing enrollment when observed.
|
||||
In Figure \ref{fig:pred_dist_diff_delay} below, we see this impact of
|
||||
keeping enrollment open.
|
||||
|
||||
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-generic}
|
||||
\caption{}
|
||||
\label{fig:pred_dist_diff_generic}
|
||||
\includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_boxplot}
|
||||
\todo{Replace this graphic with the histdiff with boxplot}
|
||||
\small{
|
||||
Values near 1 indicate a near perfect increase in the probability
|
||||
of termination.
|
||||
Values near 0 indicate little change in probability,
|
||||
        while values near $-1$ represent a decrease in the probability
|
||||
of termination.
|
||||
The scale is in probability points, thus a value near 1 is a change
|
||||
from unlikely to terminate under control, to highly likely to
|
||||
terminate.
|
||||
}
|
||||
\caption{Histogram of the Distribution of Predicted Differences}
|
||||
\label{fig:pred_dist_diff_delay}
|
||||
\end{figure}
|
||||
|
||||
Figure \ref{fig:pred_dist_dif_generic2} shows how this varies across disease categories.
|
||||
There are a few interesting things to point out here.
|
||||
Let's start by getting acquainted with the details of the distribution above.
|
||||
% - spike at 0
|
||||
% - the boxplot
|
||||
% - 63% of mass below 0 : find better way to say that
|
||||
% - For a random trial, there is a 63% chance that the impact is to reduce the probability of a termination.
|
||||
% - 2 pctg-point wide band centered on 0 has ~13% of the mass
|
||||
% - mean represents 9.x% increase in probability of termination. A quick simulation gives about the same pctg-point increase in terminated trials.
|
||||
|
||||
A few interesting interpretation bits come out of this.
|
||||
% - there are 3 regimes: low impact (near zero), medium impact (concentrated in decreased probability of termination), and high impact (concentrated in increased probability of termination).
|
||||
The first is that there appear to be three different regimes.
|
||||
The first regime consists of the low impact results, i.e. those values of $\delta_p$
|
||||
near zero.
|
||||
About 13\% of trials lie within one percentage point of zero,
suggesting that there is a reasonable chance that delaying
the close of enrollment has no impact.
|
||||
The second regime consists of the moderate impact on clinical trials'
|
||||
probabilities of termination, say values in the interval $[-0.5, 0.5]$
|
||||
on the graph.
|
||||
Most of this probability mass represents a decrease in the probability of
termination, some of it rather large.
|
||||
Finally, there exists the high impact region, almost exclusively concentrated
|
||||
around increases in the probability of termination at $\delta_p > 0.75$.
|
||||
These represent cases where delaying the close of enrollment moves a trial
from being highly likely to complete its primary objectives to
being likely or almost certain to terminate early.
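
As a rough illustration of how the three regimes described above could be tabulated, the following Python sketch splits a vector of predicted differences into low, moderate, and high impact bands. The cutoffs are the ones quoted in the text; the function name \texttt{regime\_shares}, the choice to exclude the low band from the moderate band, and the example values are assumptions made for illustration and are not the draws behind the figure.

```python
def regime_shares(deltas, low=0.01, moderate=0.5, high=0.75):
    """Return the share of predicted differences falling in each regime.

    deltas: predicted changes in termination probability, each in [-1, 1].
    """
    n = len(deltas)
    if n == 0:
        raise ValueError("deltas must not be empty")
    return {
        # Within one percentage point of zero.
        "low": sum(abs(d) <= low for d in deltas) / n,
        # Inside [-0.5, 0.5]; excluding the low band is an assumption here.
        "moderate": sum(low < abs(d) <= moderate for d in deltas) / n,
        # Large increases in the probability of termination.
        "high": sum(d > high for d in deltas) / n,
        # Share of all differences that reduce the probability of termination.
        "negative": sum(d < 0 for d in deltas) / n,
    }


# Hypothetical values for illustration only.
example = [-0.30, -0.12, -0.004, 0.0, 0.006, 0.20, 0.90, 0.99]
shares = regime_shares(example)
```

Applied to the actual posterior predictive differences, the same tabulation would give the shares discussed in this subsection; the bands could equally be defined with different cutoffs.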
|
||||
% - the high impact regime is strange because it consists of trials that moved from unlikely (<20% chance) of termination to a high chance (>80% chance) of termination. Something like 5% of all trials have a greater than 98 percentage point increase in termination. Not sure what this is doing.
|
||||
|
||||
% - Potential Explanations for high impact regime:
|
||||
How could this intervention have such a wide range in the intensity
|
||||
and direction of impacts?
|
||||
Two possible explanations are that some trials are especially susceptible to
delays in closing enrollment, or that the pattern is a result of too little data.
|
||||
% - Some trials are highly susceptible. This is the face value effect
|
||||
One option is that some categories are more susceptible to
|
||||
issues with participant enrollment.
|
||||
If this is the case, we should be able to isolate categories that contribute
|
||||
the most to this effect.
|
||||
Another is that this might be a modelling artefact, due to the relatively
|
||||
low number of trials in certain ICD-10 categories.
|
||||
In short, there might be high levels of uncertainty in some parameter values,
|
||||
which manifest as fat tails in the distributions of the $\beta$ parameters.
|
||||
Because of the logistic format of the model, these fat tails lead to
|
||||
extreme values of $p$, and potentially large changes in $\delta_p$.
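
The mechanism described above can be sketched numerically. The Python example below compares a thin-tailed coefficient (standard normal draws) with a fat-tailed one (standard Cauchy draws) and measures how often each produces a change in probability above $0.75$ after passing through the inverse logit. Everything here is a hypothetical stand-in: the baseline linear predictor \texttt{eta = -2}, the two distributions, and the function names are assumptions chosen for illustration, not the fitted model or its posterior.

```python
import math
import random


def inv_logit(x):
    """Numerically stable logistic function mapping a linear predictor to a probability."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def share_large_increase(beta_draws, eta=-2.0, cutoff=0.75):
    """Share of draws whose implied change in probability exceeds cutoff.

    eta is a hypothetical baseline linear predictor (control condition).
    """
    p_control = inv_logit(eta)
    deltas = [inv_logit(eta + b) - p_control for b in beta_draws]
    return sum(d > cutoff for d in deltas) / len(deltas)


rng = random.Random(1)
n_draws = 20000

# Thin-tailed coefficient: standard normal draws.
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

# Fat-tailed coefficient: standard Cauchy draws via the inverse CDF.
cauchy_draws = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n_draws)]

thin_share = share_large_increase(normal_draws)
fat_share = share_large_increase(cauchy_draws)
```

Under these assumptions the fat-tailed coefficient yields a far larger share of extreme increases than the thin-tailed one, even though both are centered on zero, which is the pattern the modelling-artifact explanation would predict.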
|
||||
% - Could be uncertainty. If the model is highly uncertain, e.g. there isn't enough data, we could have a small percentage of large increases. This could be in general or just for a few categories with low amounts of data.
|
||||
% -
|
||||
% -
|
||||
|
||||
I believe that this second explanation -- a model artifact due to uncertainty --
|
||||
is likely to be the cause.
|
||||
Four points lead me to believe this:
|
||||
\begin{itemize}
|
||||
    \item The low E-BFMI values suggest that the sampler is struggling
|
||||
to explore some regions of the posterior.
|
||||
According to \cite{standevelopmentteam_RuntimeWarnings_2022} this is
|
||||
often due to thick tails of posterior distributions.
|
||||
    \item When we examine the results across different ICD-10 groups
    (Figure \ref{fig:pred_dist_dif_delay2}),
    \todo{move figure from below}
    we note this same issue.
|
||||
    \item In Figure \ref{fig:betas_delay}, we see that some ICD-10 categories
|
||||
\todo{add figure}
|
||||
have \todo{note fat tails}.
|
||||
\item There are few trials available, particularly among some specific
|
||||
ICD-10 categories.
|
||||
\end{itemize}
|
||||
% - take a look at beta values and then discuss if that lines up with results from dist-diff by group.
|
||||
% - My initial thought is that there is not enough data/too uncertain. I think this because it happens for most/all of the categories.
|
||||
% -
|
||||
% -
|
||||
% -
|
||||
Overall, it is hard to escape the conclusion that more data is needed across
|
||||
many -- if not all -- of the disease categories.
|
||||
|
||||
|
||||
|
||||
Figure \ref{fig:pred_dist_dif_delay2} shows how the different disease
categories contribute to this overall result.
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-generic-group}
|
||||
\caption{}
|
||||
\label{fig:pred_dist_dif_generic2}
|
||||
\includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_by_group}
|
||||
\caption{Distribution of Predicted differences by Disease Group}
|
||||
\label{fig:pred_dist_dif_delay2}
|
||||
\end{figure}
|
||||
|
||||
|
||||
\subsection{Secondary Results}
|
||||
|
||||
% Examine beta parameters
|
||||
% - Little movement except where data is strong, general negative movement. Still really wide
|
||||
% - Note how they all learned (partial pooling) reduction in \beta from ANR?
|
||||
% - Need to discuss the 5 different states. Can't remember which one is dropped for the life of me. May need to fix parameterization.
|
||||
% -
|
||||
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{../assets/img/betas/parameter_across_groups/parameters_12_status_ANR}
|
||||
\caption{Distribution of parameters associated with ``Active, not recruiting'' status, by ICD-10 Category}
|
||||
\label{fig:parameters_ANR_by_group}
|
||||
\end{figure}
|
||||
% -
|
||||
|
||||
|
||||
\end{document}
|
||||
|
||||
@ -0,0 +1,38 @@
|
||||
\documentclass[../Main.tex]{subfiles}
|
||||
\graphicspath{{\subfix{Assets/img/}}}
|
||||
|
||||
\begin{document}
|
||||
\subsection{Diagnostics}
|
||||
Reported low E-BFMI scores (low is considered below $0.2$)
|
||||
\todo{Fill these out based on what is in the rendered report.}
|
||||
\begin{itemize}
|
||||
\item Chain 1: 0.178
|
||||
\item Chain 2: 0.189
|
||||
\end{itemize}
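
For reference, the E-BFMI of a chain is commonly estimated as the sum of squared changes in energy between successive iterations divided by the sum of squared deviations of the energy from its mean. The Python sketch below implements that estimator and applies the $0.2$ threshold quoted above. The function names and the layout of the input (one list of per-iteration energies per chain) are hypothetical; in rstan the energies would typically come from the \texttt{energy\_\_} column of the sampler parameters, and Stan's own diagnostic may differ from this sketch in small normalisation details.

```python
def e_bfmi(energies):
    """Estimate E-BFMI for one chain from its per-iteration energy values."""
    n = len(energies)
    if n < 2:
        raise ValueError("need at least two energy values")
    mean_energy = sum(energies) / n
    # Sum of squared changes in energy between successive iterations.
    numerator = sum((energies[i] - energies[i - 1]) ** 2 for i in range(1, n))
    # Sum of squared deviations of the energy from its mean.
    denominator = sum((e - mean_energy) ** 2 for e in energies)
    if denominator == 0:
        raise ValueError("energy values are constant")
    return numerator / denominator


def flag_low_chains(energy_by_chain, threshold=0.2):
    """Return the 1-based indices of chains whose E-BFMI is below threshold."""
    return [
        i
        for i, energies in enumerate(energy_by_chain, start=1)
        if e_bfmi(energies) < threshold
    ]
```

A chain whose energy drifts slowly relative to its overall spread produces a small ratio and is flagged, which matches the interpretation that parts of the posterior were not explored well.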
|
||||
|
||||
No other reported issues.
|
||||
|
||||
\begin{figure}[H]
|
||||
\caption{Diagnostics: Trace Rank Plots - $\mu$}
|
||||
    \label{fig:trace_rank_mu}
|
||||
\includegraphics[width=\textwidth]{../assets/img/diagnostics/trace_rank_plot_mu_1-4}
|
||||
\\[\smallskipamount]
|
||||
\includegraphics[width=\textwidth]{../assets/img/diagnostics/trace_rank_plot_mu_5-8}
|
||||
\\[\smallskipamount]
|
||||
\includegraphics[width=\textwidth]{../assets/img/diagnostics/trace_rank_plot_mu_9-12}
|
||||
\end{figure}
|
||||
Mixing seems to be fine.
|
||||
|
||||
\begin{figure}[H]
|
||||
\caption{Diagnostics: Trace Rank Plots - $\sigma$}
|
||||
    \label{fig:trace_rank_sigma}
|
||||
\includegraphics[width=\textwidth]{../assets/img/diagnostics/trace_rank_plot_sigma_1-4}
|
||||
\\[\smallskipamount]
|
||||
\includegraphics[width=\textwidth]{../assets/img/diagnostics/trace_rank_plot_sigma_5-8}
|
||||
\\[\smallskipamount]
|
||||
\includegraphics[width=\textwidth]{../assets/img/diagnostics/trace_rank_plot_sigma_9-12}
|
||||
\end{figure}
|
||||
Mixing is slower than for the $\mu$ values, but doesn't seem too problematic in light of
other deficiencies such as the low number of observations.
|
||||
|
||||
\end{document}
|
||||
|