From 86f9b8dfc9f621b069ac83dab9f2d8c053ab789c Mon Sep 17 00:00:00 2001
From: will king
Date: Mon, 13 Jan 2025 09:24:20 -0800
Subject: [PATCH] finished drafting results

---
 Paper/sections/06_Results.tex               | 171 +++++++++++++++-----
 Paper/sections/08_PotentialImprovements.tex |  86 +++++-----
 2 files changed, 176 insertions(+), 81 deletions(-)

diff --git a/Paper/sections/06_Results.tex b/Paper/sections/06_Results.tex
index 6eb2f02..9ae780a 100644
--- a/Paper/sections/06_Results.tex
+++ b/Paper/sections/06_Results.tex
@@ -7,24 +7,73 @@
 I describe the model fitting, the posteriors of the parameters of interest,
 and interpret the results.
 
-\subsection{Estimation Procedure}
+\subsection{Data Summaries and Estimation Procedure}
+
+% Data Summaries
+Overall, I successfully processed 162 trials, with 1,347 snapshots between them.
+Figure \ref{fig:snapshot_counts} shows the histogram of snapshots per trial.
+Most trials lasted less than 1,500 days, as can be seen in
+Figure \ref{fig:trial_durations}.
+Although a large number of snapshots are available to fit the
+model, the number of trials -- the unit of observation -- is quite low.
+These trials are also spread over multiple ICD-10 categories,
+leaving few trials per category.
+
+Next, a scatterplot gives a rough idea of the observed
+relationship between the number of snapshots and the duration of trials.
+We can see this in Figure \ref{fig:snapshot_duration_scatter}, where
+a moderate positive correlation (measured at $0.34$) is apparent.
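For reference, the reported correlation is a plain Pearson coefficient, which can be sanity-checked without any statistical libraries. The sketch below is illustrative Python using made-up (snapshot count, duration) pairs, not the actual trial data:

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient, no external dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (snapshot count, duration in days) pairs -- not the real trials.
snapshots = [3, 5, 8, 12, 20]
durations = [400, 900, 650, 1300, 1100]
print(round(pearson_r(snapshots, durations), 2))
```

Running the same computation over the real 162 trials should reproduce the reported $0.34$.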
+
+\begin{figure}[H]
+    \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
+    \todo{Replace this graphic with the histogram of trial durations}
+    \caption{Histogram of Trial Durations}
+    \label{fig:trial_durations}
+\end{figure}
+
+\begin{figure}[H]
+    \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
+    \todo{Replace this graphic with the histogram of snapshots}
+    \caption{Histogram of Snapshot Counts per Trial}
+    \label{fig:snapshot_counts}
+\end{figure}
+
+\begin{figure}[H]
+    \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
+    \todo{Replace this graphic with the scatterplot comparing durations and snapshots}
+    \caption{Scatterplot Comparing the Count of Snapshots and Trial Duration}
+    \label{fig:snapshot_duration_scatter}
+\end{figure}
+
+% Estimation Procedure
 I fit the econometric model using
 Stan
 \cite{standevelopmentteam_StanModelling_2022}
 through the rstan
 \cite{standevelopmentteam_RStanInterface_2023}
-interface.
-
-I had X Trials with X snapshots in total. \todo{Fill out.}
-
+interface using 4 chains with
 %describe
-X\todo{UPDATE VALUES}
+2,500 warmup iterations and
-X\todo{UPDATE VALUES}
-sampling iterations in six chains.
+2,500
+sampling iterations each.
+
+Two of the chains exhibited a low
+estimated Bayesian fraction of missing information (E-BFMI),
+suggesting that some parts of the posterior distribution
+were not explored well during the model fitting.
+I suspect this is due to the low number of trials in some of the
+ICD-10 categories.
+We can see in Figure \ref{fig:barchart_idc_categories} that some of these
+disease categories had a single trial represented while others were
+not represented at all.
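For readers unfamiliar with the E-BFMI diagnostic: it compares how much the sampler's energy changes between iterations to the overall spread of the energy distribution, with low values (commonly, below roughly 0.2--0.3) flagging poorly explored regions. A minimal Python sketch of the statistic, using toy energy traces rather than actual sampler output:

```python
def e_bfmi(energy):
    """E-BFMI for one chain: mean squared energy jump over the energy variance."""
    n = len(energy)
    mean_e = sum(energy) / n
    jump = sum((energy[i] - energy[i - 1]) ** 2 for i in range(1, n)) / (n - 1)
    var_e = sum((e - mean_e) ** 2 for e in energy) / (n - 1)
    return jump / var_e

# Toy energy traces: `good` mixes freely; `sticky` lingers at two energy levels.
good = [10.0, 12.5, 9.0, 13.0, 10.5, 12.0, 9.5, 11.5]
sticky = [10.0, 10.1, 10.0, 10.2, 14.0, 14.1, 14.0, 13.9]
print(e_bfmi(good), e_bfmi(sticky))
```

rstan reports this diagnostic per chain after sampling; the `sticky` trace mimics a chain whose energy changes slowly relative to its overall spread, which is the pattern behind the warning.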
-% \subsection{Data Exploration}
-% \todo{fill this out later.}
-%look at trial
+\begin{figure}[H]
+    \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
+    \todo{Replace this graphic with the barchart of trials by categories.}
+    \caption{Bar chart of trials by ICD-10 categories}
+    \label{fig:barchart_idc_categories}
+\end{figure}
 
 \subsection{Primary Results}
@@ -38,6 +87,7 @@ keeping enrollment open.
 
 \begin{figure}[H]
     \includegraphics[width=\textwidth]{../assets/img/current/pred_dist_diff-delay}
+    \todo{Replace this graphic with the histdiff with boxplot}
     \small{
     Values near 1 indicate a near perfect increase in the
     probability of termination.
@@ -52,26 +102,84 @@ keeping enrollment open.
     \label{fig:pred_dist_diff_delay}
 \end{figure}
 
-We can see from figure
-\ref{fig:pred_dist_diff_delay}
-That there are roughly four regimes.
-The first consists of trials that experiences nearly no effect,
-i.e. have values near zero.
-Trials in the second regime experience a mild to large reduction in
-the probability of termination, with X percent of the probability mass
-between about 5 percentage points and 50 percentage point reductions.
-The third regime is those trials that experience a mild to large
-increase in the probability of termination,
-from an increase o 5 percentage points to about 75 percentage points.
-The fourth and final regime is the X\% of trials that experience a significant
-(greater than 75 percentage point) increase in the probability of
-termination.
-%Notes on interpretation
-% - increase vs decrease on graph
+There are a few interesting things to point out here.
+Let's start by getting acquainted with the details of the distribution above.
+% - spike at 0
+% - the boxplot
+% - 63% of mass below 0 : find better way to say that
+% - For a random trial, there is a 63% chance that the impact is to reduce the probability of a termination.
+% - 2 pctg-point wide band centered on 0 has ~13% of the mass
+% - mean represents 9.x% increase in probability of termination. A quick simulation gives about the same pctg-point increase in terminated trials.
+
+A few interesting points of interpretation come out of this.
+% - there are 3 regimes: low impact (near zero), medium impact (concentrated in decreased probability of termination), and high impact (concentrated in increased probability of termination).
+The first is that there appear to be three different regimes.
+The first regime consists of the low-impact results, i.e. those values of $\delta_p$
+near zero.
+About 13\% of trials lie within a single percentage point change of zero,
+suggesting that there is a reasonable chance that delaying
+a close of enrollment has no impact.
+The second regime consists of moderate impacts on clinical trials'
+probabilities of termination, say values in the interval $[-0.5, 0.5]$
+on the graph.
+Most of this probability mass represents a decrease in the probability of
+a termination, some of it rather large.
+Finally, there exists the high-impact region, concentrated almost exclusively
+on increases in the probability of termination at $\delta_p > 0.75$.
+These represent cases where delaying the close of enrollment moves a trial
+from being highly likely to complete its primary objectives to being
+likely or almost certain to terminate early.
+% - the high impact regime is strange because it consists of trials that moved from unlikely (<20% chance) of termination to a high chance (>80% chance) of termination. Something like 5% of all trials have a greater than 98 percentage point increase in termination. Not sure what this is doing.
+
+% - Potential Explanations for high impact regime:
+How could this intervention have such a wide range in the intensity
+and direction of impacts?
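The regime shares quoted above are simple functionals of the posterior draws of $\delta_p$. The following Python sketch computes them on synthetic draws whose shape merely mimics the description (a near-zero spike, a moderate-decrease lobe, and a high-increase tail); the regime weights are invented, not the fitted posterior:

```python
import random

random.seed(1)
# Synthetic stand-in for posterior draws of delta_p; the weights below are
# invented for illustration and do not come from the fitted model.
draws = ([random.gauss(0.0, 0.004) for _ in range(1300)]       # near-zero spike
         + [random.uniform(-0.5, -0.05) for _ in range(5650)]  # moderate decreases
         + [random.uniform(0.05, 0.5) for _ in range(2050)]    # moderate increases
         + [random.uniform(0.75, 1.0) for _ in range(1000)])   # high-impact tail

n = len(draws)
below_zero = sum(d < 0 for d in draws) / n          # share of mass below zero
near_zero = sum(abs(d) <= 0.01 for d in draws) / n  # within 1 pctg point of zero
mean_shift = sum(draws) / n                          # average change in P(termination)
print(f"below 0: {below_zero:.2f}, within 1pp of 0: {near_zero:.2f}, mean: {mean_shift:+.3f}")
```

Applied to the real posterior draws, these three one-liners yield the quantities discussed in the text (share of mass below zero, mass near zero, and mean shift).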
+Two explanations stand out: some trials may be particularly susceptible,
+or this may be a result of too little data.
+% - Some trials are highly susceptible. This is the face value effect
+One option is that some categories are more susceptible to
+issues with participant enrollment.
+If this is the case, we should be able to isolate categories that contribute
+the most to this effect.
+Another is that this might be a modeling artifact, due to the relatively
+low number of trials in certain ICD-10 categories.
+In short, there might be high levels of uncertainty in some parameter values,
+which manifest as fat tails in the distributions of the $\beta$ parameters.
+Because of the logistic link in the model, these fat tails lead to
+extreme values of $p$, and potentially large changes in $\delta_p$.
+% - Could be uncertainty. If the model is highly uncertain, e.g. there isn't enough data, we could have a small percentage of large increases. This could be in general or just for a few categories with low amounts of data.
+% -
+% -
+
+I believe that this second explanation -- a model artifact due to uncertainty --
+is likely to be the cause.
+Four points lead me to believe this:
+\begin{itemize}
+    \item The low E-BFMI values suggest that the sampler is struggling
+    to explore some regions of the posterior.
+    According to \cite{standevelopmentteam_RuntimeWarnings_2022}, this is
+    often due to thick tails of posterior distributions.
+    \item When we examine the results across different ICD-10 groups in
+    Figure \ref{fig:pred_dist_dif_delay2},
+    \todo{move figure from below}
+    we note this same issue.
+    \item In Figure \ref{fig:betas_delay}, we see that some ICD-10 categories
+    \todo{add figure}
+    have \todo{note fat tails}.
+    \item There are few trials available, particularly among some specific
+    ICD-10 categories.
+\end{itemize}
+% NOTE: maybe change order to be ebfmi, group hist-diff or distdiff, tail width, then data size.
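The fat-tail mechanism can be demonstrated directly: pushing a heavy-tailed linear predictor through the inverse logit piles probability mass near 0 and 1. The Python sketch below compares normal draws with Student-$t$ draws (2 degrees of freedom) at an arbitrary scale of 1.5; none of these numbers come from the fitted model:

```python
import math
import random

def inv_logit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def student_t(df):
    """One Student-t draw: a standard normal over the root of a scaled chi-squared."""
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(chi2 / df)

random.seed(0)
N = 20000
SCALE = 1.5  # arbitrary predictor scale, not a fitted value

normal_eta = [random.gauss(0.0, SCALE) for _ in range(N)]
fat_eta = [SCALE * student_t(2) for _ in range(N)]  # much fatter tails

def extreme_share(etas, cut=0.98):
    """Fraction of implied probabilities above `cut` or below 1 - cut."""
    return sum(1 for e in etas if not (1 - cut) < inv_logit(e) < cut) / len(etas)

print(extreme_share(normal_eta), extreme_share(fat_eta))
```

The heavy-tailed predictor produces many times more implied probabilities beyond 0.98 or below 0.02 than the normal one, which is exactly the pattern of extreme $\delta_p$ values described above.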
+% - take a look at beta values and then discuss if that lines up with results from dist-diff by group.
+% - My initial thought is that there is not enough data/too uncertain. I think this because it happens for most/all of the categories.
 % -
 % -
 % -
 % -
+Overall, it is hard to escape the conclusion that more data are needed across
+many, if not all, of the disease categories.
+
 
 % The probability mass associated with a each 10 percentage point change are in table \ref{tab:regimes}
 % \begin{table}[H]
@@ -100,15 +208,6 @@ result comes from different disease categories.
     \label{fig:pred_dist_dif_delay2}
 \end{figure}
 
-Overall, we can see that there appear to be some trials or situations
-that are highly suceptable to enrollment difficulties, and this
-appears to hold for all disease categories for which I have data.
-This relative homogeneity of results may be due to the
-partial pooling effect from the hierarchal model
-and the fact that the sample size per disease is rather small.
-An additional explanation is that the variance of the parameter distributions
-might be high enough for each trial to have a few situation in which they have
-a high probability of terminating.
diff --git a/Paper/sections/08_PotentialImprovements.tex b/Paper/sections/08_PotentialImprovements.tex
index 2f89ab3..c85cae7 100644
--- a/Paper/sections/08_PotentialImprovements.tex
+++ b/Paper/sections/08_PotentialImprovements.tex
@@ -12,40 +12,40 @@
 The most important step is to increase the number of observations available.
 Currently this requires matching trials to ICD-10 codes by hand, but there
 are certainly some steps that can be taken to improve the speed with which
 this can be done.
-
-\subsection{Covariance Structure}
-
-As noted in the diagnostics section, many of the convergence issues seem
-to occure in the covariance structure.
-Instead of representing the parameters $\beta$ as independently normal: -\begin{align} - \beta_k(d) \sim \text{Normal}(\mu_k, \sigma_k) -\end{align} -I propose using a multivariate normal distribution: -\begin{align} - \beta(d) \sim \text{MvNormal}(\mu, \Sigma) -\end{align} -I am not familiar with typical approaches to priors on the covariance matrix, -so this will require a further literature search as to best practices. - -\subsection{Finding Reasonable Priors} - -In standard bayesian regression, heavy tailed priors are common. -When working with a bayesian bernoulli-logit model, this is not appropriate as -heavy tails cause the estimated probabilities $p_n$ to concentrate around the -values $0$ and $1$, and away from values such as $\frac{1}{2}$ as discussed in -\cite{mcelreath_statistical_2020}. %TODO: double check the chapter for this. - -I indend to take the general approach recommended in \cite{mcelreath_statistical_2020} of using -prior predictive checks to evaluate the implications of different priors -on the distribution on $p_n$. -This would consist of taking the independent variables and predicting the values -of $p_n$ based on a proposed set of priors. -By plotting these predictions, I can ensure that the specific parameter priors -used are consistent with my prior beliefs on how $p_n$ behaves. -Currently I believe that $p_n$ should be roughly uniform or unimodal, centered -around $p_n = \frac{1}{2}$. - +% +% \subsection{Covariance Structure} +% +% As noted in the diagnostics section, many of the convergence issues seem +% to occure in the covariance structure. 
+% Instead of representing the parameters $\beta$ as independently normal: +% \begin{align} +% \beta_k(d) \sim \text{Normal}(\mu_k, \sigma_k) +% \end{align} +% I propose using a multivariate normal distribution: +% \begin{align} +% \beta(d) \sim \text{MvNormal}(\mu, \Sigma) +% \end{align} +% I am not familiar with typical approaches to priors on the covariance matrix, +% so this will require a further literature search as to best practices. + +% \subsection{Finding Reasonable Priors} +% +% In standard bayesian regression, heavy tailed priors are common. +% When working with a bayesian bernoulli-logit model, this is not appropriate as +% heavy tails cause the estimated probabilities $p_n$ to concentrate around the +% values $0$ and $1$, and away from values such as $\frac{1}{2}$ as discussed in +% \cite{mcelreath_statistical_2020}. %TODO: double check the chapter for this. +% +% I indend to take the general approach recommended in \cite{mcelreath_statistical_2020} of using +% prior predictive checks to evaluate the implications of different priors +% on the distribution on $p_n$. +% This would consist of taking the independent variables and predicting the values +% of $p_n$ based on a proposed set of priors. +% By plotting these predictions, I can ensure that the specific parameter priors +% used are consistent with my prior beliefs on how $p_n$ behaves. +% Currently I believe that $p_n$ should be roughly uniform or unimodal, centered +% around $p_n = \frac{1}{2}$. +% \subsection{Imputing Enrollment} @@ -81,21 +81,17 @@ found a way to do so. \subsection{Improving Measures of Market Conditions} -Finally, the currently employed measure of market conditions -- the number of -brands using the same active ingredients -- is not a very good measure of -the options available to potential participants of a clinical trial. 
-The ideal measures would capture the alternatives available to treat a given
-disease (drug meeting the given indication) at the time of the trial snapshot,
-but this data is hard to come by.
 In addition to the fact that many diseases may be treated by non-pharmaceutical means, off-label prescription of pharmaceuticals is legal at the federal level (\cite{commissioner_understanding_2019}). These two facts both complicate measuring market conditions.
-
-One dataset that I have only investigated briefly is the \url{DrugCentral.org}
-database which tracks official indications and some off-label indications as
-well
-(\cite{ursu_drugcentral_2017}).
+One way to address non-pharmaceutical treatments is to concentrate on disease
+domains that are treated primarily with pharmaceuticals.
+This, however, requires domain knowledge that I do not currently have.
+% One dataset that I have only investigated briefly is the \url{DrugCentral.org}
+% database which tracks official indications and some off-label indications as
+% well
+% (\cite{ursu_drugcentral_2017}).
 
 \end{document}