\documentclass[../Main.tex]{subfiles}
\graphicspath{{\subfix{Assets/img/}}}
\begin{document}
As noted above, the analysis completed so far has several limitations.
Below I discuss these limitations and the steps that I believe will address them.
\subsection{Increasing number of observations}
The most important step is to increase the number of observations available.
Currently this requires matching trials to ICD-10 codes by hand.
Improvements in large language models may make this data more accessible, or
the data may be available in a commercial dataset.
\subsection{Enrollment Modelling}
One of the original goals of this project was to examine the impact that
enrollment struggles have on the probability of trial termination.
Unfortunately, this requires a model of clinical trial enrollment, and the
necessary enrollment data are not in my dataset.
In most cases the trial sponsor reports the anticipated enrollment value
while the trial is still recruiting and only updates the actual enrollment
after the trial has ended.
Some trials do publish an up-to-date record of their enrollment numbers, but this
is rare.
If a Bayesian model of multisite enrollment can be developed for the disease categories
in question, then it will be possible to impute this missing data probabilistically,
which will allow me to estimate the direct effect of slow enrollment
\cite{mcelreath_statistical_2020}.
Such a model does not exist yet, although some work on multisite enrollment forecasting has
been done by \cite{CHECK ZOTERO NOTES FOR CITATIONS}.
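
As a minimal sketch of what such a model could look like, and not a specification I have estimated, suppose trial $i$ recruits at $J_i$ sites that all open at the same time, and let $n_{ij}(t)$ be the number of participants enrolled at site $j$ by time $t$.
The Poisson-gamma structure, the simultaneous site opening, and all of the notation below are assumptions introduced here for illustration only:
\begin{equation}
    n_{ij}(t) \mid \lambda_{ij} \sim \mathrm{Poisson}(\lambda_{ij} t),
    \qquad
    \lambda_{ij} \sim \mathrm{Gamma}(\alpha_{d}, \beta_{d}),
\end{equation}
where $\alpha_d$ and $\beta_d$ are the shape and rate parameters shared by trials in disease category $d$, and the site rates $\lambda_{ij}$ are independent.
Total enrollment $N_i(t) = \sum_{j=1}^{J_i} n_{ij}(t)$ then satisfies
\begin{equation}
    N_i(t) \mid \Lambda_i \sim \mathrm{Poisson}(\Lambda_i t),
    \qquad
    \Lambda_i = \sum_{j=1}^{J_i} \lambda_{ij} \sim \mathrm{Gamma}(J_i \alpha_{d}, \beta_{d}),
\end{equation}
so that expected enrollment is $\mathrm{E}[N_i(t)] = J_i \alpha_d t / \beta_d$.
Under these assumptions, the unobserved enrollment of a trial that is still recruiting could be treated as a parameter and sampled jointly with the rest of the model, rather than replaced by the anticipated enrollment value.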
\subsection{Improving Population Estimates}
The Global Burden of Disease dataset contains the best estimates of disease
population sizes that I have found so far.
Unfortunately, for some conditions it can be relatively imprecise due to
its focus on providing data geared towards public health policy.
For example, GBD contains categories for both
drug-resistant and drug-susceptible tuberculosis.
In contrast, there is no category for non-age-related macular degeneration.
One resulting concern is that for a given ICD-10 code, the applicable GBD population
estimates may act as an estimate of the upper bound of population size
(\cite{global_burden_of_disease_collective_network_global_2020}).
The dataset contains various measures of disease severity, so it may be
worth investigating how to incorporate some of those measures.
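
The upper-bound concern can be stated more explicitly, using notation introduced here only for illustration.
Let $C_g$ be the set of ICD-10 codes that map to GBD category $g$, and let $P_c$ be the true population with the condition coded $c$.
Assuming that the GBD estimate $\hat{P}_g$ targets the whole category and that no individual is counted under more than one code in $C_g$,
\[
    P_c \le \sum_{c' \in C_g} P_{c'} = P_g
    \qquad \mbox{for every } c \in C_g ,
\]
so $\hat{P}_g$ estimates an upper bound on $P_c$, with equality only when every other code in $C_g$ has zero population.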
\subsection{Improving Measures of Market Conditions}
% Deficiency: cannot measure effect of market conditions because of endogeneity of population and market conditions (fatal diseases)
In addition to the fact that many diseases may be treated by non-pharmaceutical
means, off-label prescription of pharmaceuticals is legal at the federal level
(\cite{commissioner_understanding_2019}).
These two facts both complicate measuring market conditions.
One way to address non-pharmaceutical treatments is to concentrate on domains
that are primarily treated by pharmaceuticals.
Another way to address this would be to focus the analysis on just a few specific
diseases, for which a history of treatment options can be compiled.
This second approach may also allow the researcher to distinguish the direction
of causality between population size and number of drugs on the market;
for example, drugs to treat a chronic, non-fatal disease will probably not
affect the market size much in the short to medium term.
In such cases the effect of market conditions can be isolated from
the effects of the population.
% Alternative approaches
% - diseases with constant kill rates? population effect should be relatively constant?
\end{document}