\graphicspath{{\subfix{Assets/img/}}}
\begin{document}
The computational approach I take is based on
\cite{Maliar2019}'s Bellman Residual Minimization, in which the
policy and value functions are approximated using a neural network.
In summary, the Bellman equation is rewritten in the form:
\begin{align}
Q = V(S_t,D_t) - F(S_t,D_t,X_t(S_t,D_t)) - \beta V(S_{t+1},D_{t+1})
\end{align}
with a policy maximization condition such as:
\begin{align}
M = \left[ F(S_t,D_t,X_t(S_t,D_t)) + \beta V(S_{t+1},D_{t+1}) \right]
\end{align}
In the deterministic case, a loss function can be constructed in
either of the following equivalent forms:
\begin{align}
\phi_1 &= Q^2 - vM \\
\phi_2 &= \left(M - Q - \frac{v}{2}\right)^2 - v \cdot \left(Q + \frac{v}{4}\right)
\end{align}
where $v$ is an external weighting parameter which can be cross-validated.
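As a concreteness check, the two loss forms can be transcribed directly into code. This is an illustrative sketch, not the project's implementation: the function names and sample values are mine, with $Q$, $M$, and $v$ as in the equations above.

```python
# Direct transcription of the two candidate losses phi_1 and phi_2.
# Q is the Bellman residual, M the continuation term, v the weight.

def phi_1(Q, M, v):
    """First loss form: squared residual minus weighted continuation."""
    return Q**2 - v * M

def phi_2(Q, M, v):
    """Second loss form: the rearranged variant given above."""
    return (M - Q - v / 2) ** 2 - v * (Q + v / 4)

# At these particular illustrative values (M = 2Q) the two forms agree.
print(phi_1(1.0, 2.0, 0.5), phi_2(1.0, 2.0, 0.5))
```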
By choosing a neural network as the functional approximation, we are able to
use the fact that a NN with a single hidden layer can approximate a broad
class of functions under certain conditions \autocite{White1990}.
We can also
take advantage of the significant computational and practical improvements
currently revolutionizing Machine Learning.
Some examples include the use of specialized hardware and the ability to
transfer learning between models, both of which can speed up functional
approximation.
\subsection{Computational Plan}
The neural network library I've chosen to use is Flux.jl \autocite{Flux.jl-2018},
a neural network library implemented in and for the Julia language,
although the Bellman Residual Minimization algorithm would work equally well in
PyTorch or TensorFlow\footnote{
    The initial reason I investigated Flux/Julia is the source-to-source
    automatic differentiation capabilities, which I intended to use to
    implement a generic version of \cite{Maliar2019}'s Euler equation
    iteration method.
    While I still believe this is possible and that Flux represents one of the
    best tools available for that specific purpose,
    I've been unsuccessful at implementing the algorithm.
}.
Below I note some of the design, training, and implementation decisions.
% Data Description
The data used to train the network is simulated data, pulled from random distributions.
One advantage of this approach is that by changing the distribution, the emphasis
in the training changes.
Initially training can be focused on certain areas of the state space, but later
training can put the focus on other areas as their importance is recognized.
In the case that we don't know which areas of the state space to investigate,
it is possible to optimize over a given dataset and then iterate the stocks
and debris forward many periods.
If the debris and stocks don't line up well with the initial training dataset,
we can change the distribution to cover the stocks and debris from the iteration,
thus bootstrapping the distribution of the training set.
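The bootstrapping idea above can be sketched as follows. This is a hypothetical illustration: the forward dynamics and sampling ranges are stand-ins, not the model's actual laws of motion.

```python
import random

def simulate_forward(state, periods=5):
    """Stand-in for the model's laws of motion (illustrative dynamics)."""
    stock, debris = state
    for _ in range(periods):
        stock = 0.9 * stock + 1.0              # decay plus launches
        debris = 1.05 * debris + 0.01 * stock  # autocatalytic growth
    return (stock, debris)

def refit_distribution(samples):
    """Re-centre the uniform sampling box on the states actually reached."""
    stocks = [s for s, _ in samples]
    debris = [d for _, d in samples]
    return (min(stocks), max(stocks)), (min(debris), max(debris))

random.seed(0)
initial = [(random.uniform(0, 10), random.uniform(0, 1)) for _ in range(100)]
reached = [simulate_forward(s) for s in initial]
stock_range, debris_range = refit_distribution(reached)
# Subsequent training draws use stock_range / debris_range
# instead of the original (0, 10) / (0, 1) boxes.
```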
\subsubsection{Constellation Operators}
% Operators
% Branched Policy Topology
% Individual Value functions
% Training Loop
Although there are multiple operators, the individual policy functions
show up jointly as the code is currently implemented.
For this reason, I've implemented each operator's policy function
as a ``branch'' within a single neural network.
These branches are configured such that they each receive the same
inputs (stocks and debris), but decisions in each branch are made without
reference to the other branches.
The results are then concatenated together into the final policy vector.
When training a given operator, the appropriate branch is unfrozen so that
operator can train.
Value functions are implemented as unique neural networks at the
constellation operator level, much like the operator's Bellman residual
function.
The training loops take the form of: for each epoch,
\begin{enumerate}
    \item generate data
    \item for each operator
    \begin{enumerate}
        \item Unfreeze branch
        \item Train policy function on data
        \item Freeze branch
        \item Train Value function on data
    \end{enumerate}
    \item Check termination conditions
\end{enumerate}
Overall, this allows each operator's policy and value functions to be
approximated on its own Bellman residuals, while maintaining a convenient
interface.
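The loop above can be sketched as a minimal runnable skeleton. All names here are hypothetical stand-ins, not the project's code: branches are dicts with a frozen flag, and the "training" functions just count updates in place of real gradient steps.

```python
# Skeleton of the per-operator train loop: unfreeze -> policy -> freeze -> value.

def make_network(operators):
    return {op: {"frozen": True, "policy_steps": 0, "value_steps": 0}
            for op in operators}

def train_policy(branch, data):
    if not branch["frozen"]:              # only unfrozen branches update
        branch["policy_steps"] += len(data)

def train_value(branch, data):
    branch["value_steps"] += len(data)    # value nets are never frozen

def training_loop(network, epochs, generate_data, done=lambda: False):
    for _ in range(epochs):
        data = generate_data()                 # 1. generate data
        for op, branch in network.items():     # 2. for each operator
            branch["frozen"] = False           #    unfreeze branch
            train_policy(branch, data)         #    train policy on data
            branch["frozen"] = True            #    freeze branch
            train_value(branch, data)          #    train value on data
        if done():                             # 3. termination check
            break
    return network

net = training_loop(make_network(["op_a", "op_b"]), epochs=3,
                    generate_data=lambda: [0.0] * 4)
```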
\subsubsection{Planner}
% Planner
% policy topology
% Value function topology
% Training loop
The policy function for the Fleet Planner does not require any separate
branches, although it could have them if desired for comparison purposes.
The key point, though, is that no parameter freezing is done during training,
allowing the repercussions on other constellations to be taken into account.
Similarly, there is a single neural network used to estimate the value
function.
The training loops take the form of: for each epoch,
\begin{enumerate}
    \item generate data
    \item Train policy function on data
    \item Train Value function on data
    \item Check termination conditions
\end{enumerate}
\subsubsection{Heterogeneous Agents and Nash Equilibria}
One key question is how to handle the case of heterogeneous agents.
In the processes outlined above, the heterogeneous agents are simply
identified by their position in the state and action vectors, and
the NN then learns how to operate with each of them\footnote{
    I believe it may be possible to create some classification of
    different heterogeneous agent types that allows for simpler function
    transfers, but the implementation will take some extensive code design
    work.
}.
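The position-based identification can be made concrete with a small sketch. The layout here is hypothetical ($N$ operator stocks followed by the debris level), chosen only to illustrate the indexing convention.

```python
# Heterogeneous agents identified purely by position in the state vector.

N = 3
state = [10.0, 5.0, 2.0, 0.4]   # [s^1, s^2, s^3, D]

def stock_of(state, i):
    """Stock of operator i (0-indexed slot in the state vector)."""
    return state[i]

def debris_of(state):
    """Debris level stored after the N operator stocks."""
    return state[N]
```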
The most difficult step is creating the Euler equations.
When working with high-dimensional problems involving differentiation,
three general computational approaches exist:
\begin{itemize}
    \item Using a symbolic library (sympy) or language (mathematica) to
    create the Euler equations.
    This has the disadvantage of being (very) slow, but the advantage that
    for a single problem specification it only needs to be completed once.
    It requires taking a matrix inverse, which can easily complicate formulas
    and is computationally expensive, being approximately an $O(n^3)$
    algorithm.
    \item Using numerical differentiation (ND).
    The primary issue with ND is that errors can grow quite quickly when
    performing algebra on numerical derivatives.
    This requires tracking how errors can grow and compound within your
    specific formulation of the problem.
    \item Using automatic differentiation (AD) to differentiate the computer
    code directly.
    This approach has a few major benefits.
    \begin{itemize}
        \item Precision is high, because you are calculating exact
        derivatives of your computer functions.
        \item ML is heavily dependent on AD, thus the tools are plentiful
        and well tested.
        \item The coupling of AD and ML leads to a tight integration with
        the neural network libraries, simplifying the calibration procedure.
    \end{itemize}
\end{itemize}
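To illustrate why AD is exact rather than approximate, here is a minimal forward-mode AD sketch using dual numbers: derivatives propagate through ordinary code by the chain rule, with no symbolic expressions and no finite-difference truncation error. This is a textbook toy, not the Flux/Zygote machinery itself.

```python
# Forward-mode AD via dual numbers (val, dot) carrying f and f'.

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

def derivative(f, x):
    """Evaluate f'(x) exactly (to float precision) in one forward pass."""
    return f(Dual(x, 1.0)).dot

# d/dx (3x^2 + 2x) = 6x + 2, so at x = 2.0 this gives 14.0
print(derivative(lambda x: 3 * x * x + 2 * x, 2.0))
```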
I have chosen to use AD to generate an Euler equation function, which will
then be the basis of our objective function.
The first step is to construct the intertemporal transition functions
(e.g.\ \ref{put_refs_here}).
% Not sure how much detail to use.
% I'm debating on describing how it is done.
These take derivatives of the value function at time $t$ as an input, and
output derivatives of the value function at time $t+1$.
Once this function has been finished, it can be combined with the laws of
motion in an iterated manner to transition between times $t$ and $t+k$.
I did so by coding a function that iteratively composes the transition
and laws of motion functions, returning a $k$-period transition function.
The second step is to generate functions that represent the optimality
conditions.
By taking the appropriate derivatives with respect to the laws of motion and
benefit functions, these can be constructed explicitly.
Once these two functions are completed, they can be combined to create
the Euler equations, as described in appendix
\ref{APX:Derivations:EulerEquations}.
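The iterated-composition step can be sketched as follows. `step` is a hypothetical stand-in for one application of the transition function plus laws of motion; the real versions operate on value-function derivatives rather than a scalar.

```python
from functools import reduce

def compose_k(step, k):
    """Return the k-fold composition: step o step o ... o step (k times)."""
    return lambda state: reduce(lambda s, _: step(s), range(k), state)

one_period = lambda s: 0.5 * s + 1.0   # illustrative one-period map
three_period = compose_k(one_period, 3)
# 0.5*(0.5*(0.5*s + 1) + 1) + 1 = 0.125*s + 1.75, so three_period(8.0) = 2.75
print(three_period(8.0))
```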
% % % Is it FaFCCs or recursion that allows this to occur?
% % % I believe both are ways to approach the problem.
% \paragraph { Functions As First Class Citizens}
% The key computer science tool that makes this possible is the concept
% of ``functions as first class citizens'' (FaFCCs).
% In every computer language there are primitive values that functions
% operate on.
% When a language considers FaFCCs, functions are one of the primitives
% that functions can operate on.
% This is how we can get
% AD in pytorch does not work by FaFCC though, instead constructing a computational graph.
\paragraph{Training}
With the Euler equation and resulting objective function in place,
standard training approaches can be used to fit the function.
I plan on using some variation on stochastic gradient descent.
Normally, neural networks are trained on real world data.
As this is a synthetic model, I am planning on training it on random
selections from the state space.
If I can find data on how satellites are and have been distributed, I plan
on selecting from that distribution.
\paragraph{Heterogeneous Agents}
When the laws of motion depend on other agents' decisions, as is the case
described in \ref{SEC:Laws}, the opportunity for Nash and other
game-theoretic equilibria arises, and intertemporal iteration may require
knowing the other agents' best response functions.
One benefit of using neural networks is that they can find standard
equilibrium concepts, including mixed Nash equilibria if configured
properly.
% concerns about nash computability
I believe I can model this in the constellation operators' case
by solving for the policy functions of each class of operator
simultaneously.
I would like to verify this approach, as I have not yet dived into
some of the mathematics that deeply.
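The simultaneous-solution idea is analogous to iterated best response, which can be sanity-checked on a game with a known Nash equilibrium. The sketch below uses a standard Cournot duopoly (demand intercept $A$, marginal cost $C$); it is an analogy, not the operator model itself.

```python
# Iterated best response on a Cournot duopoly; Nash quantity is (A - C)/3.

A, C = 10.0, 1.0

def best_response(q_other):
    """Best response maximizing q*(A - q - q_other) - C*q over q >= 0."""
    return max(0.0, (A - C - q_other) / 2.0)

q1 = q2 = 0.0
for _ in range(60):
    # both players respond simultaneously to the other's last action
    q1, q2 = best_response(q2), best_response(q1)

# q1 and q2 converge to the analytic Nash equilibrium (A - C)/3 = 3.0
```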
\subsection{Functional Forms}
The reference functional forms for the model are similar to those
given in \autocite{RaoRondina2020}.
\begin{itemize}
    \item The linear per-period benefit function:
    \begin{align}
        u^i(S_t, D_t, X_t) = \pi s^i_t - f \cdot x^i_t
    \end{align}
    \item Each constellation's satellite survival function:
    \begin{align}
        R^i(S_t, D_t) = e^{-d \cdot D_t - \sum^N_{j=1} h^j s^j_t}
    \end{align}
    \item The debris autocatalysis function:
    \begin{align}
        g(D_t) = g \cdot D_t, \quad g > 1
    \end{align}
\end{itemize}
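The functional forms above transcribe directly into code. The parameter values below are purely illustrative placeholders, not calibrated estimates.

```python
import math

# Placeholder parameters: pi_ (per-satellite revenue), f (launch cost),
# d (debris hazard), h (per-constellation collision coefficients), g (growth).
pi_, f, d, g = 1.0, 0.2, 0.5, 1.1
h = [0.01, 0.02]

def benefit(s_i, x_i):
    """Per-period benefit: u^i = pi * s^i - f * x^i."""
    return pi_ * s_i - f * x_i

def survival(stocks, debris):
    """Survival rate: R^i = exp(-d*D - sum_j h^j * s^j)."""
    return math.exp(-d * debris - sum(hj * sj for hj, sj in zip(h, stocks)))

def debris_growth(debris):
    """Debris autocatalysis: g(D) = g * D with g > 1."""
    return g * debris

# Example: two constellations with 10 and 5 satellites, debris level 0.1
rate = survival([10.0, 5.0], 0.1)   # exp(-0.25), roughly 0.779
```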
\subsubsection{Parameter Values}
% I'm just guessing.
Currently, I've not found a way to estimate the proper parameters to use,
and there needs to be a discussion of how to calibrate those parameters.
So far, my goal is to choose parameters with approximately
the correct order of magnitude.
% \subsection{Existence concerns}
% check matrix inverses etc.
%
% I am currently working on a plan to guarantee existence of solutions.
% Some of what I want to do is check numerically crucial values and
% mathematically necessary conditions for existence and uniqueness.
% Unfortunately this is little more than just a plan right now.
\end{document}