diff --git a/Paper/Main.tex b/Paper/Main.tex
index 5bfcf16..a8e5857 100644
--- a/Paper/Main.tex
+++ b/Paper/Main.tex
@@ -24,7 +24,7 @@
 \titlespacing*{\paragraph}
 {0pt}{3.25ex plus 1ex minus .2ex}{1.5ex plus .2ex}
 
-\title{The effects of market conditions and enrollment on the
+\title{The effects of open enrollment on the
 completion of clinical trials\\
 \small{Preliminary Draft}}
 \author{William King}
diff --git a/Paper/sections/06_Results.tex b/Paper/sections/06_Results.tex
index 107175e..4044d6d 100644
--- a/Paper/sections/06_Results.tex
+++ b/Paper/sections/06_Results.tex
@@ -81,7 +81,7 @@ open instead of closing enrollment when observed.
 In figure
 \ref{fig:pred_dist_diff_delay} below, we see this impact of
 keeping enrollment open.
-
+% \begin{minipage}{\textwidth}
 \begin{figure}[H]
 \includegraphics[width=\textwidth]{../assets/img/dist_diff_analysis/p_delay_intervention_distdiff_boxplot}
 \small{
@@ -98,9 +98,38 @@ keeping enrollment open.
 \label{fig:pred_dist_diff_delay}
 \end{figure}
 
+\begin{table}[H]
+ \centering
+ \caption{Boxplot Summary Statistics}
+ \label{table:boxplotsummary}
+ \begin{tabular}{ | c c c c c c c c | }
+ \hline
+ 5th & 10th & 25th & median &
+ 75th & 90th & 95th & mean \\
+ \hline
+ -0.376 & -0.264 & -0.129 & -0.023 &
+ 0.145 & 0.925 & 0.982 & 0.096 \\
+ \hline
+ \end{tabular}
+\end{table}
+
+% \end{minipage}
+
+The key figures from the boxplot in figure
+\ref{fig:pred_dist_diff_delay}
+are sumarized in table \ref{table:boxplotsummary}
 There are a few interesting things to point out here.
 Let's start by getting aquainted with the details of the distribution above.
-It can be devided into a few different regimes.
+A couple more points
+First, 63\% of the probability mass is equal to or below zero.
+Seconds, about 13\% of the probability mass is contained within the interval
+[-0.01,0.01].
+The full 5\% percentile table can be found in table
+\ref{TABLE:PercentilesOfDistributionOfDifferences}
+in appendix
+\ref{Appendix:Results}
+
+It can also be devided into a few different regimes.
 % - spike at 0
 % - the boxplot
 % - 63% of mass below 0 : find better way to say that
@@ -117,29 +146,15 @@ The second regime consists of the moderate impact on
 clinical trials' probabilities of termination, say values in the interval
 $[-0.5, 0.5]$ on the graph.
 Most of this probability mass is represents a decrease in the probability of
-a termination, some of it rather large.
-Finally, there exists the high impact region, almost exclusively concentrated
-around increases in the probability of termination at $\delta_p > 0.75$.
+a termination, some of it rather large decreases.
+The third regime consists of the high impact region,
+almost exclusively concentrated above increases in the probability of
+termination $\delta_p > 0.75$.
 These represent cases where delaying the close of enrollemnt changes a trial
 from a case where they were highly likely to complete their primary objectives
 to a case where they were likely or almost certain to terminate the trial early.
 % - the high impact regime is strange because it consists of trials that moved from unlikely (<20% chance) of termination to a high chance (>80% chance) of termination. Something like 5% of all trials have a greater than 98 percentage point increase in termination. Not sure what this is doing.
 
-Based on the boxplot below, there are a couple of things to note.
-First, the median effect is a 2.3 percentage point decrease
-in the probability of termination.
-Second, for a random selction from our trials,
-there is a 63\% chance that the impact is to
-reduce the probability of a termination.
-Third, about 13\% of the probability mass is contained within the interval
-[-0.1,0.1].
-Finally, the mean effect is measured as a 9.6 percentage point increase in
-the probability of termination.
-The full percentile table can be found in
-\ref{TABLE:PercentilesOfDistributionOfDifferences}
-in appendix
-\ref{Appendix:Results}
-
 % Looking at the spike around zero, we find that 13.09% of the probability mass
 % is contained within the band from [-1,1].
 % Additionally, there was 33.4282738% of the probability above that
@@ -175,7 +190,20 @@ tend to have a similar results:
 Again, note the high mass near zero, the general decrease in the probability
 of termination, and then the strong upper tails.
 
-Continuing to the $\beta$ parameters,
+Continuing to the $\beta$ parameters in figure
+\ref{fig:parameters_ANR_by_group},
+we can see the estimated distributions
+the status: \textbf{Active, not recruiting}.
+The prior distributions were centered on zero, but we can see that the
+pooled learning has moved the mean
+values negative, representing reductions in the probability of termination
+across the board.
+This decrease in the probability of termination is strongest in the categories of Neoplasms ($n=49$),
+Musculoskeletal diseases ($n=17$), and Infections and Parasites ($n=20$), the three categories with the most data.
+As this is a comparison against the trial status XXX, we note that YYY.
+\todo{The natural comparison I want to make is against the Recruting status. Do I want to redo this so that I can read that directly?It shouldn't affect the $\delta_p$ analysis, but this could probably use it. YES, THIS UPDATE NEEDS TO HAPPEN. The base needs to be ``active not recruiting.''}
+Overall, this is consistent with the result that extending a clinical trial's enrollment period will reduce the probability of termination.
+
 \begin{figure}[H]
 \includegraphics[width=\textwidth]{../assets/img/betas/parameter_across_groups/parameters_12_status_ANR}
 \caption{Distribution of parameters associated with ``Active, not recruiting'' status, by ICD-10 Category}
@@ -183,15 +211,6 @@ Continuing to the $\beta$ parameters,
 \end{figure}
 
 % -
-Finally, in figure \ref{fig:parameters_ANR_by_group}, we can see the estimated distributions of the $\beta$ parameter for
-the status: \textbf{Active, not recruiting}.
-The prior distributions were centered on zero, but we can see that the pooled learning has moved the mean
-values negative, representing reductions in the probability of termination across the board.
-This decrease in the probability of termination is strongest in the categories of Neoplasms ($n=49$),
-Musculoskeletal diseases ($n=17$), and Infections and Parasites ($n=20$), the three categories with the most data.
-As this is a comparison against the trial status XXX, we note that
-\todo{The natural comparison I want to make is against the Recruting status. Do I want to redo this so that I can read that directly?It shouldn't affect the $\delta_p$ analysis, but this could probably use it. YES, THIS UPDATE NEEDS TO HAPPEN. The base needs to be ``active not recruiting.''}
-Overall, this suggests that extending a clinical trial's enrollment period will reduce the probability of termination.
 
 % - Potential Explanations for high impact regime:
 This leads to the question:
@@ -201,12 +220,15 @@ The most likely explanations in my mind are that either some trials are
 highly suceptable to enrollment struggles or that this is a modelling artifact.
 
 % - Some trials are highly suceptable. This is the face value effect
-The first option -- that some categories are more suceptable to
+The first option -- that some trials are more suceptable to
 issues with participant enrollment -- should allow us to
 isolate categories or trials that contribute the most to this effect.
-In figure
-\ref{fig:pred_dist_dif_delay2}, it appears that most of the trials have
-this high impact regime at $\delta_p > 0.75$.
+This is not what we find when we inspect the categories
+in figure
+\ref{fig:pred_dist_dif_delay2}.
+Instead it appears that most of the categories have this high
+impact regime when $\delta_p > 0.75$, although the maximum value
+of this regime varies considerably.
 
 Another explanation is that this is a modelling artefact due to priors
 with strong tails and the relatively low number of trials in
@@ -221,7 +243,9 @@ A few things lead me to believe this:
 \begin{itemize}
  \item The low fractions of E-BFMI suggest that the sampler is struggling
  to explore some regions of the posterior.
- According to \cite{standevelopmentteam_RuntimeWarnings_2022} this is
+ According to
+ \cite{standevelopmentteam_runtimewarningsconvergence_2022}
+ this is
  often due to thick tails of posterior distributions.
  During earlier analysis, when I had about 100 trials, the number of
  warnings was significantly higher.
@@ -234,14 +258,14 @@ A few things lead me to believe this:
  we see that most ICD-10 categories have fat tails in the $\beta$s,
  even among the categories relatively larger sample sizes.
 
-
-
 \end{itemize}
 
 Overally it is hard to escape the conclusion that more data is needed across
 many -- if not all -- of the disease categories.
 At the same time, the median result is a decrease in the probability of
 termination when the enrollment period is held open.
+My inclination is to believe that the overall effect is to reduce the
+probability of termination.
 
 
 \end{document}
diff --git a/Paper/sections/22_appendix_full_results.tex b/Paper/sections/22_appendix_full_results.tex
index 647bda3..e4da2c2 100644
--- a/Paper/sections/22_appendix_full_results.tex
+++ b/Paper/sections/22_appendix_full_results.tex
@@ -3,11 +3,11 @@
 
 \begin{document}
 
-
-\begin{center}
+\begin{table}[h!]
+ \caption{Table of Percentiles of Distribution of Differences}
  \label{TABLE:PercentilesOfDistributionOfDifferences}
- % \caption{Table of Percentiles of Distribution of Differences}
- \begin{tabular}{cc}
+\centering
+ \begin{tabular}{c c}
  \hline
  Percentile & Value \\
  \hline
@@ -34,6 +34,10 @@
  100\% & 1.0000000 \\
  \hline
  \end{tabular}
-\end{center}
+\end{table}
+
+% This is here specifically to allow the table above to compile. Not sure why it is needed...
+\begin{table}[h!]
+\end{table}
 
 \end{document}
diff --git a/todo.org b/todo.org
index b2d3c38..be0e642 100644
--- a/todo.org
+++ b/todo.org
@@ -47,3 +47,14 @@
 **** TODO Redo analysis using "Recruitng" as the base status
  The goal is to get the $\beta$'s for active, not recruitng.
 
+**** TODO Rerun analysis with correct base
+ [[[[file:/mnt/backups/home/dad/research/PhD_Deliverables/JobMarketPaper/Paper/sections/06_Results.tex::204]]]]
+ The natural comparison I want to make is against the Recruting status.
+ Do I want to redo this so that I can read that directly?
+ It shouldn't affect the $\delta_p$ analysis, but this could probably use it.
+ YES, THIS UPDATE NEEDS TO HAPPEN. The base needs to be ``active not recruiting.''
+
+ So the plan is to set ``Active, not recruiting'' as the base condition, then
+ measure the effect when that is chagned to ``Recruiting''.
+ If that is negative, then extending recruiting reduces the probability of
+ termination.