\todo{Replace this graphic with the histdiff with boxplot}
\small{
Values near 1 indicate a near perfect increase in the probability
of termination.
\label{fig:pred_dist_diff_delay}
\end{figure}
There are a few interesting things to point out in
Figure~\ref{fig:pred_dist_diff_delay}.
Let's start by getting acquainted with the details of the distribution above.
For a random trial, there is a 63\% chance that the impact of the
intervention is to reduce the probability of a termination; that is, about
63\% of the probability mass lies below zero.
% - mean represents 9.x% increase in probability of termination. A quick simulation gives about the same pctg-point increase in terminated trials.
A few interesting points of interpretation come out of this.
The first is that there appear to be three different regimes:
low impact (values near zero), medium impact (concentrated in decreased
probability of termination), and high impact (concentrated in increased
probability of termination).
The first regime consists of the low impact results, i.e. those values of
$\delta_p$ near zero.
About 13\% of trials lie within a single percentage point of zero,
suggesting that there is a reasonable chance that delaying
a close of enrollment has no impact.
The second regime consists of moderate impacts on clinical trials'
probabilities of termination, say values in the interval $[-0.5, 0.5]$
on the graph.
Most of this probability mass represents a decrease in the probability of
a termination, some of it rather large.
Finally, there is the high impact region, concentrated almost exclusively
on increases in the probability of termination at $\delta_p > 0.75$.
These represent cases where delaying the close of enrollment changes a trial
from one that was highly likely to complete its primary objectives to
one that was likely or almost certain to terminate early.
% - the high impact regime is strange because it consists of trials that moved from unlikely (<20% chance) of termination to a high chance (>80% chance) of termination. Something like 5% of all trials have a greater than 98 percentage point increase in termination. Not sure what this is doing.
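The regime fractions discussed here can be read directly off the posterior draws of $\delta_p$. A minimal sketch, assuming the draws are available as a NumPy array; the array name and the toy distribution standing in for the posterior are placeholders, not output from the fitted model:

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder for the posterior draws of delta_p; in practice these
# come from the fitted model, one value per trial per draw.
delta_p = rng.normal(loc=-0.05, scale=0.2, size=10_000)

# Mass favouring fewer terminations (delta_p < 0).
frac_below_zero = np.mean(delta_p < 0)
# The "low impact" regime: within one percentage point of zero.
frac_near_zero = np.mean(np.abs(delta_p) < 0.01)
# The "high impact" regime: increases beyond 75 percentage points.
frac_high_impact = np.mean(delta_p > 0.75)

print(frac_below_zero, frac_near_zero, frac_high_impact)
```

Swapping the placeholder array for the model's actual $\delta_p$ draws reproduces the percentages quoted in the text.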
% - Potential Explanations for high impact regime:
How could this intervention have such a wide range in the intensity
and direction of its impacts?
Two explanations suggest themselves: some trials are genuinely susceptible,
or this is a result of too little data.
% - Some trials are highly suceptable. This is the face value effect
One possibility is that some disease categories are more susceptible to
issues with participant enrollment.
If this is the case, we should be able to isolate the categories that
contribute the most to this effect.
Another is that this might be a modelling artifact, due to the relatively
low number of trials in certain ICD-10 categories.
In short, there might be high levels of uncertainty in some parameter values,
which manifest as fat tails in the distributions of the $\beta$ parameters.
Because of the logistic form of the model, these fat tails lead to
extreme values of $p$, and potentially large changes in $\delta_p$.
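To see how fat tails in $\beta$ translate into extreme probabilities, consider a small simulation. This is an illustrative sketch, not the fitted model: the normal and Student-t draws below are stand-ins for a well-identified and a poorly identified coefficient, respectively, and the scales are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def inv_logit(x):
    # Logistic transform mapping a linear predictor to a probability.
    return 1.0 / (1.0 + np.exp(-x))

# Stand-ins for posterior draws of a coefficient: well-identified
# (normal) versus poorly identified and fat-tailed (Student-t, 2 df).
beta_tight = rng.normal(0.0, 1.0, size=100_000)
beta_fat = rng.standard_t(df=2, size=100_000)

p_tight = inv_logit(beta_tight)
p_fat = inv_logit(beta_fat)

# Fraction of draws pushed to extreme probabilities (p < 0.05 or p > 0.95).
extreme_tight = np.mean((p_tight < 0.05) | (p_tight > 0.95))
extreme_fat = np.mean((p_fat < 0.05) | (p_fat > 0.95))
```

With these stand-in scales, roughly ten percent of the fat-tailed draws land outside $(0.05, 0.95)$ versus well under one percent of the normal draws, which is the mechanism suggested above for the extreme values of $\delta_p$.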
% - Could be uncertanty. If the model is highly uncertain, e.g. there isn't enough data, we could have a small percentage of large increases. This could be in general or just for a few categories with low amounts of data.
I believe that this second explanation -- a model artifact due to uncertainty --
is likely to be the cause.
Four points lead me to believe this:
\begin{itemize}
\item The low values of E-BFMI suggest that the sampler is struggling
to explore some regions of the posterior.
According to \cite{standevelopmentteam_RuntimeWarnings_2022}, this is
often due to thick tails in the posterior distributions.
\item When we examine the results across different ICD-10 groups
(Figure \ref{fig:pred_dist_dif_delay2}),
\todo{move figure from below}
we note this same issue.
\item In Figure \ref{fig:betas_delay}, we see that some ICD-10 categories
\todo{add figure}
have \todo{note fat tails}.
\item There are few trials available, particularly among some specific
ICD-10 categories.
\end{itemize}
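For reference, the E-BFMI diagnostic in the first point can be computed directly from the sampler's energy draws. A minimal sketch; the white-noise and random-walk series below are synthetic stand-ins for a well-mixing and a poorly mixing chain, not output from this model:

```python
import numpy as np

def e_bfmi(energy):
    """Energy Bayesian fraction of missing information for one chain.

    Values well below ~0.3 are commonly read as the sampler failing to
    explore the energy distribution, e.g. because of heavy posterior tails.
    """
    energy = np.asarray(energy, dtype=float)
    # Mean squared energy change between successive draws, relative to
    # the marginal variance of the energy.
    return np.mean(np.diff(energy) ** 2) / np.var(energy)

rng = np.random.default_rng(2)
good_chain = rng.normal(size=4_000)            # white noise: mixes well
bad_chain = np.cumsum(rng.normal(size=4_000))  # random walk: mixes poorly
```

A well-mixing chain gives an E-BFMI near 2, while the slowly drifting random walk gives a value far below the 0.3 warning threshold.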
% NOTE: maybe change order to be ebfmi, group hist-diff or distdiff, tail width, then data size.
% - take a look at beta values and then discuss if that lines up with results from dist-diff by group.
% - My initial thought is that there is not enough data/too uncertain. I think this because it happens for most/all of the categories.
Overall, it is hard to escape the conclusion that more data is needed, across
many, if not all, of the disease categories.
% The probability mass associated with each 10 percentage point change is in table \ref{tab:regimes}
%\begin{table}[H]
\label{fig:pred_dist_dif_delay2}
\end{figure}
Overall, we can see that there appear to be some trials or situations
that are highly susceptible to enrollment difficulties, and this
appears to hold for all disease categories for which I have data.
This relative homogeneity of results may be due to the
partial pooling effect of the hierarchical model
and the fact that the sample size per disease is rather small.
An additional explanation is that the variance of the parameter distributions
might be high enough for each trial to have a few situations in which they have