\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}

\begin{document}

%% Describe goal
% Estimate probability distribution of normalized durations and conclusion statuses.
% Explain why this answers questions well.
% How do I propose estimating that?

%%NOTATION
First, some notation:
\begin{itemize}
    \item $n$: indexes trial snapshots.
    \item $y_n$: whether the trial associated with snapshot $n$ terminated (true) or completed (false).
    \item $d$: indexes ICD-10 disease categories.
    \item $d_n$: the disease category of the trial associated with snapshot $n$.
    \item $x_n$: the remaining independent variables associated with the snapshot.
        These include\footnote{No trials in the current dataset are ever suspended.}:
        \begin{enumerate}
            \item Elapsed duration
            \item arcsinh of the number of brands
            \item arcsinh of the DALYs from high SDI countries
            \item arcsinh of the DALYs from high-medium SDI countries
            \item Enrollment (no distinction between anticipated and actual)
            \item Dummy Status: Not yet recruiting
            \item Dummy Status: Recruiting
            \item Dummy Status: Active, not recruiting
            \item Dummy Status: Enrolling by invitation
        \end{enumerate}
\end{itemize}

The arcsinh transform is used because it behaves like a log transform for large values
but is defined at zero, with $\text{arcsinh}(0) = 0$.

The Bayesian model used to measure the direct effects of enrollment and the number of
other brands is specified as a hierarchical logistic regression:
\begin{align}
    y_n &\sim \text{Bernoulli}(p_n) \\
    p_n &= \text{logit}^{-1}(x_n \vec \beta(d_n))
\end{align}
where $\vec \beta(d)$ is indexed by $k$ for each parameter in $x$,
and by $d \in \{1,2,\dots,21,22\}$ for each general ICD-10 category.
The coefficients are distributed
\begin{align}
    \beta_k(d) &\sim \text{Normal}(\mu_k,\sigma_k)
\end{align}
with hyperparameters
\begin{align}
    \mu_k &\sim \text{Normal}(0,1) \\
    \sigma_k &\sim \text{Gamma}(2,1)
\end{align}

Other variables are implicitly conditioned on, because they were used to select the trials of interest.
These include:
\begin{itemize}
    \item Is the trial Phase 3?\footnote{
        Conditioning on Phase 3 is equivalent to asserting that previous trials
        occurred and had acceptable safety and efficacy results.
    }
    \item Does the trial have a Data Monitoring Committee?
    \item Is the compound an FDA-regulated drug?
\end{itemize}
%TODO: double check the SQL used to select trials of interest.

\end{document}