F_{min} $ \item The hydrogens are separated by more than a minimum distance: \hfill $\norm{\vec{r}_{HH^{(0)}}} > d_{min}$ \item The hydrogens are separated by less than a maximum distance: \hfill $\norm{\vec{r}_{HH^{(0)}}} < d_{max}$ \end{enumerate} The roaming hydrogen, designated $H^{(0)}$, is uniquely identified in configuration space as the hydrogen with the greatest Euclidean separation from total center of mass. I define the restoring force on the roaming hydrogen as: \begin{equation} F_{H^{(0)}} = - \hat{r}_{H^{(0)}-CM} \cdot \vec{\nabla}_{H^{(0)}} V(\vec{R}) \end{equation} where \begin{equation} \vec{\nabla}_{H^{(0)}} = \left( \frac{\partial}{\partial x_{H^{(0)}}} , \frac{\partial}{\partial y_{H^{(0)}}} , \frac{\partial}{\partial z_{H^{(0)}}}\right) \end{equation} \begin{equation} \vec{r}_{H^{(0)}-CM} = \vec{r}_{H^{(0)}} - \vec{r}_{CM} \end{equation} and $V(\vec{R})$ is the formaldehyde potential energy function; $\hat{r}_{H^{(0)}-CM}$ is a unit vector defined in the usual way. The constants in the criteria are given below and will be justified in a forthcoming appendix. \begin{center} \begin{tabular}{l l} Parameter & Value \\ \hline $F_{min}$ & $-2.5\times 10^{-4}$ $E_H/a_0$ \\ $d_{min}$ & 5.86728 $a_0$ \\ $d_{max}$ & 9.00000 $a_0$ \end{tabular} \end{center} Briefly, however: \begin{enumerate} \item $F_{H^{(0)}}$ is the force on the roaming hydrogen towards the center of mass. As usual, negative values are attractive. I place a minimum on the force because in roaming trajectories, we observe large, persistent \ce{H} - \ce{HCO} separations, which require low attractive forces on the wayfaring hydrogen. \item A minimum separation between the hydrogens in required because otherwise equilibrium and transition state configurations, clearly not representative of roaming, would be mis-classified as roaming. I use the hydrogen separation as opposed to \ce{H} - \ce{HCO} separation because large hydrogen - formyl separation can also correspond to dissociation to molecular products. \item A maximum hydrogen separation is also imposed to exclude radical dissociation, which otherwise comprises a majority of the space. \end{enumerate} These criteria can be formally encoded in a function as follows: \begin{align}\label{eqn:p-roaming} P_{roaming}(\vec{R}; F_{min}, d_{min}, d_{max}) \propto \; &\Theta \left( F_{H^{(0)}} - F_{min} \right) \\ \cdot &\Theta \left( \norm{\vec{r}_{HH^{(0)}}} - d_{min} \right) \\ \cdot &\Theta \left(d_{max} - \norm{\vec{r}_{HH^{(0)}}} \right) \end{align} where $\Theta$ is the step function: \begin{equation} \Theta(x) = \begin{cases} 1 &, x>0 \\ 0 &, \text{otherwise} \end{cases} \end{equation} To sample the potential energy landscape ensemble \cite{wang:2007:pele}, we can use a similar scheme: \begin{equation}\label{eqn:p-pele} P_{pele}(\vec{R}; E_L) \propto \Theta \left( E_L - V(\vec{R}) \right) \end{equation} where $E_L$ is the landscape energy. Combining equation \ref{eqn:p-pele} with \ref{eqn:p-roaming} allows us to sample the portion of the roaming region that is also in a given potential energy landscape ensemble. This is desirable because it allows us to set the landscape energy for sampled configurations in advance. Perhaps this is an element of the technique that would be useful to other members of the group. It should be noted that since our combined function has binary outputs, the probabilities in \refeq{markovSequence} will always be 0 or 1. This is a special case of the more general class of problems than the Metropolis algorithm is capable of handling---within the allowed region, we seek uniform sampling rather than a variable density. Put another way, because we have no notion of ``sort of roaming'' or ``sort of in the potential energy landscape ensemble'', the random walk will never do something ``sort of bad'' \emph{i.e.}, will never take ``uphill steps''. When Brady and co-workers\cite{doll:1981} implemented this scheme, it was to sample the energy shell of the microcanonical ensemble. While there is no notion of ``sort of the right energy'' for the microcanonical ensemble, they used a pre-limit form of the delta function for their probability, which allowed their walker to meander back and forth across the energy shell. \subsection{Picking a Scaling Factor}\label{subsec:pick-constant} For the linear propagator described in section \ref{subsec:details}, we would like to optimize the scaling factor, $\delta_x$, such that the acceptance ratio falls within the desired bounds. The binary probability function could introduce some added complications, but we avoid them under the following assumption: \begin{itemize} \item The set to be sampled is dense; that is, points in the set are arbitrarily close to others in the set \end{itemize} This condition allows us to assume that for sufficiently small steps, moves will \emph{always} fall within the acceptance region. This is important because it allows us to take the $\delta_x \to 0$ limit as yielding an acceptance ratio of 1. In the case of a binary probability function, the other limit, $\delta \to \infty$, implies an acceptance ratio equal to the value $\langle P(\vec{x}) \rangle$. In the case of the roaming region of formaldehyde, this average is 0. It would also be convenient to assume that the set is connected; that is, it is possible to translate between any two points in the set while remaining in the set. This a statement of the ergodicity of the system and is actually a stronger requirement than the first assumption. It allows us to assume that the entire set is reachable from any $x_0$. \emph{However}, in the potential energy landscape ensemble, we have no guarantees that this assumption will hold. The general scheme for selecting $\delta_x$ is this: \begin{enumerate} \item Guess an initial value for ${\delta}_{x}$ \item \label{it:reg:start} Compute a Markov chain of length $k$ while keeping track of the acceptance ratio \item If the ratio is larger than desired, scale by $(1+\alpha)$ to increase the length. Likewise, if the ratio is smaller than desired, scale by $(1-\alpha)$ to decrease the step. \item Goto \ref{it:reg:start} and repeat $M$ times. \end{enumerate} If $\alpha$ is too large, the results will be unstable as $\delta_x$ oscillates in and out of the acceptable range. Too small and convergence will take a very long time. Similar observations can be made about $k$. I found acceptable results with $\alpha=0.05$ and $k=1000$. Using this plan for picking $\delta_x$ I achieved an acceptance ratio of 82.30 \% with a scaling factor of 0.2900 $a_0$ after $1.1 \times 10^6$ total steps. This step size was then use for sampling the roaming region. All of my parameters are given in the table below. \begin{center} \begin{tabular}{l l} Parameter & Value \\ \hline ${\delta}_x$ (guess) & 0.01 $a_0$ \\ $k$ & 1000 \\ target acceptance ratio & 10\% to 90\% \\ $\alpha$ & 0.05 \\ $M$ & 1100 \\ \hline ${\delta}_x$ (final) & 0.2900 $a_0$ \\ acceptance ratio & 82.30\% \end{tabular} \end{center} \section{Results} The objects in section \ref{subsec:details} are general and, desiring to exploit this, I implemented them in C in such a way that arbitrary probability functions and propagators could be used\footnote{This was certainly no more time-consuming than a one-off implementation because it forced me to structure my functions in an organized way.}. \subsection{A simple test} As a test of my implementation, I sampled the set defined by: \begin{equation} P(\vec{x}) = \Theta(1 - \norm{\vec{x}}) \end{equation} in 2 and 3 dimensions. This is, of course, the unit-ball. I used all the other procedures described in section \ref{sec:implementation} and obtained an acceptance ratio of 37.18 \% in generating a chain of $10^4$ steps using a step size of $\delta_x = 4.62884$. A plot of my results in 2 dimensions appears in figure \ref{fig:disk} \begin{figure} \begin{center} \includegraphics[width=.6\textwidth]{disk} \caption{Test of the Markov chain sampling scheme in the unit disk}{\label{fig:disk}} \end{center} \end{figure} I also computed correct-looking distributions for an annulus in 2 and 3 dimensions: \begin{equation} P(\vec{x}) = \Theta \left(1 - \norm{\vec{x}} \right) \cdot \Theta \left(\norm{\vec{x}} - \frac{1}{2} \right) \end{equation} and a variable-density version of the ball: \begin{equation} P(\vec{x}) = \Theta(1 - \norm{\vec{x}}) \cdot \norm{\vec{x}}^{\alpha} \end{equation} where $\alpha=1,2$ in 2 and 3 dimensions. \FloatBarrier \subsection{The Formaldehyde Potential} Given that there is no reason to expect that the roaming region is self-connected, I wanted to use many seed values for the Markov chain. To this end, I collected all points from all 37600 cm$^{-1}$ MD trajectories subject to the following constraints: \begin{itemize} \item trajectories terminated within 12.1 ps as molecular products \item the points satisfied $V(\vec{x}) < 0.1591467 E_H$; this was the landscape energy used in my first analysis. \item points were in the ``roaming region'' as defined in section \ref{subsec:pfunc} \end{itemize} These criteria yielded 1143 points in the roaming region. Using each of the points as a different value for $x_0$, I generated Markov chains of length $10^5$, recording values every 100 moves. This gave a list of $1.143 \times 10^6$ points in the roaming region with an average acceptance ratio $83.7742 \pm 0.7272 \%$). These points were then shuffled and used as intermediate roaming points in the geodesic path-finding algorithm. A few visualizations of the generated points follow. The figures show the position of the roaming hydrogen in a reference frame such that the origin is the formyl center of mass, the \ce{HCO} plane is the $xy$ plane and the \ce{CO} axis is parallel to the $y$ axis. Red coloration indicates attraction to formyl group while blue, repulsion. The figures are promising because they seem to indicate that the algorithm is behaving as it should\footnote{Yes, this is quite weak, what tests should/can I do to verify their validity?}. The case of the single chain depicted in figure \ref{fig:consec} shows the random walker beginning to explore a small arc of the region. In figure \ref{fig:mcall}, we see that the region appears to be uniformly sampled. \begin{figure}[H] \begin{center} \includegraphics[width=\textwidth]{figs/MCTestConsecutive} \caption{\label{fig:consec} $2.5 \times 10^3$ randomly selected points from a single $10^4$ state Markov chain on the formaldehyde PES.} \end{center} \end{figure} \begin{figure}[H] \begin{center} \includegraphics[width=\textwidth]{figs/MCAll} \caption{\label{fig:mcall} $2.5 \times 10^3$ randomly selected points from the $1.14 \times 10^6$ roaming points generated on the formaldehyde PES.} \end{center} \end{figure} \bibliographystyle{JAmChemSoc} \bibliography{../bibs/general,../bibs/roaming,../bibs/geodesics,../bibs/semiclassical,../bibs/self} \end{document}