\chapter{Acquisition}  \label{cha:acq}

% TODO: cool quote, if I can think of one
\begin{dquote}
  The question of software correctness ultimately boils down to, ``Does it do what we have in our minds, even the things we have not gotten around to thinking about yet?''
  \dsignature{Alistair Cockburn}
\end{dquote}

\clearpage

In the Wright Group, \gls{PyCMDS} replaces the old acquisition software `ps control', written by Kent Meyer, and `Control for Lots of Research in Spectroscopy', written by Schuyler Kain.  %
PyCMDS directly addresses the hardware during experiments.

\section{Overview}  % =============================================================================

PyCMDS has, through software improvements alone, dramatically lessened scan times...

\begin{ditemize}
  \item simultaneous motor motion
  \item digital signal processing  % TODO: reference section when it exists
  \item ideal axis positions (\autoref{acq:sec:ideal_axis_positions})
\end{ditemize}

% TODO: screenshot

% TODO: modularity

% TODO: n-dimensional scans...

% TODO: calibration

\section{Structure}  % ============================================================================

I think of PyCMDS as a central program with three kinds of modular ``plugins'' that can be extended indefinitely:
\begin{ditemize}
  \item Hardware: things that can be set to a position (\autoref{aqn:sec:hardware}).
  \item Sensors: things that can be used to measure a signal (\autoref{aqn:sec:sensors}).
  \item Acquisition modules: things that can be used to define and carry out an acquisition, and associated post-processing (\autoref{aqn:sec:somatic}).
\end{ditemize}
The first design rule for PyCMDS is that these three things should be easy for the average (motivated) user to add by herself.  %
Modularity and extensibility are important for all software projects, but they are of paramount importance for acquisition software simply because the diversity of hardware and experimental configurations is so great.
%
One can imagine 90\% overlap between the data processing and simulation needs of one spectroscopist and the next, but there is almost no overlap between hardware configurations even in the two primary instruments maintained by the Wright Group.  %
Besides the extendable modular pieces, the rest of PyCMDS is a mostly-static code-base that accepts modules and does the necessary things to handle display of information from, and communication between, them.  %

\subsection{Multithreading}  % --------------------------------------------------------------------

For the kinds of acquisitions that the Wright Group has done, the acquisition software spends the vast majority of its run-time waiting---waiting for user input through mouse clicks or keyboard presses, waiting for hardware to finish moving or for sensors to finish reading and return signals.  %
Despite all of this downtime, it is crucial that the software respond very quickly when instructions or signals are received.  %
A multi-threaded implementation is necessary to achieve this.  %
The main thread handles the graphical user interface (GUI) and other top-level tasks.  %
Everything else happens in child threads.  %
Each hardware instance (e.g. a delay stage) lives in its own thread, as does each sensor.  %
Since only one scan happens at a time, all acquisition modules share a single thread that handles the orchestration of hardware motion, sensor operation, and data processing that the chosen acquisition requires.  %

Threads are powerful because they allow for ``semi-synchronous'' operation.  %
Imagine PyCMDS is in the middle of a 2D delay-delay scan, and the scan thread has just told each of the two delay stages to head to their destinations.  %
PyCMDS must sit in a tight loop to keep track of the position as closely as possible during motor motion.
%
In a single-threaded configuration, this tight loop would only run for one delay at a time, such that PyCMDS would have to finish shepherding one delay stage before turning its attention to the second.  %
In a multi-threaded configuration, each thread will run simultaneously, trading CPU cycles at a low level far faster than a human can perceive.  %
This switching is handled in an OS and hardware specific way---luckily it is all abstracted through platform-agnostic Qt threads.  %

Threads are dangerous because it is hard to pass information between them.  %
Without any special protection, nothing stops two threads from simultaneously editing and reading the same location in memory.  %
If a delay stage is writing its position to memory as a 64-bit double at the same time as the acquisition thread reads that memory address, the acquisition thread will read in nonsense (or worse, it will crash).  %
So some strategy is needed to ensure that threads respect each other.  %
The Mutex design allows threads to ``lock'' an object such that it cannot be modified by a different thread.  %
This lock is like the ``talking stick'' employed by many early childhood educators.  %
When the talking stick is being used, only the child that holds the stick is allowed to speak.  %
The stick must be passed to another child (as directed by the teacher) before they can share their thought.  %
PyCMDS makes heavy use of Mutexes, in particular the \bash{QMutex} class \cite{QMutex}.  %

Mutexes handle basic information transfer (two threads can both safely modify and read a particular object), but what about sending instructions between threads?  %
Here the problem is deciding what happens when multiple instructions are given simultaneously, or an instruction is given while another instruction is being carried out.  %
Again, this is a classic problem in computer science, and the classic answer is the queue.
%
Queues are just like lines at the coffee shop---each person (instruction) is served (carried out) in the order that they joined the line.  %
Queues are commonly referred to as FIFO (First In First Out) for this reason.  %
PyCMDS uses queues for almost all instructions.  %

Finally, PyCMDS makes extensive use of the ``signals and slots'' construct, which is somewhat unique (and certainly original) to Qt.  %
Signals and slots are powerful because they allow threads without instruction to go completely silent, making them essentially free in terms of CPU usage.  %
Normally, a thread needs to sit in a loop merely listening for instructions.  %
Within the Qt framework, a thread can be ``woken'' by a signal without needing that thread to explicitly ``listen''.  %
These concepts fit within the broader umbrella of ``event-driven programming'', a concept that has been used in many languages and frameworks (notably high level LabVIEW tends to be very event-driven).  %
The Qt signals and slots system massively simplifies programming within PyCMDS.  %

Note that multithreading is very different from multiprocessing.  %

\subsection{High level objects}  % ----------------------------------------------------------------

PyCMDS is made to be extended and developed by and for inexperienced programmers, so it is crucial to create something that is less complicated...

At its most basic, PyCMDS defines the following simple data types (derived from \python{PyCMDS_object}):
\begin{ditemize}
  \item Bool
  \item Combo
  \item Filepath
  \item Number
  \item String
\end{ditemize}

These classes do multiple things.  %
First, they \emph{are} Mutexes, with thread-safe \python{read} and \python{write} methods.  %
Second, they support ``implicit'' storage in ini files.  %
Third, they know how to participate in the GUI.  %
They can display their value, and if modified they will propagate that modification to the internal threads of outward...
Finally, they have special properties like units and limits etc...
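
The thread-safe \python{read} and \python{write} behaviour can be illustrated with a minimal sketch.  %
This is not the PyCMDS implementation: it assumes Python's standard-library \python{threading.Lock} in place of \python{QMutex}, and the constructor arguments and the limit check are hypothetical choices made only for illustration.  %

\begin{codefragment}{python}
import threading


class Number:
    """Hypothetical stand-in for a PyCMDS-style value object (not the real class)."""

    def __init__(self, initial_value=0.0, units=None,
                 limits=(float("-inf"), float("inf"))):
        # threading.Lock plays the role that QMutex plays in PyCMDS (assumption)
        self._lock = threading.Lock()
        self._value = initial_value
        self.units = units
        self.limits = limits

    def read(self):
        # only one thread at a time may hold the lock (the "talking stick")
        with self._lock:
            return self._value

    def write(self, value):
        low, high = self.limits
        if not low <= value <= high:
            raise ValueError("value outside of limits")
        with self._lock:
            self._value = value


# a hardware thread writes its position while the main thread reads it
position = Number(0.0, units="ps", limits=(-100.0, 100.0))
writer = threading.Thread(target=position.write, args=(12.5,))
writer.start()
writer.join()
current = position.read()
\end{codefragment}

Keeping the lock inside the object means that callers never handle it directly, which is one way to make thread safety the default for less experienced contributors.  %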

Without getting into details, let's investigate the key ``signals and slots'' that hardware and sensors have.  %

% TODO: elaborate

The following is the top-level hardware class, parent of all hardware and sensors.  %

\begin{figure}
  \includepython{"acquisition/parent_hardware.py"}
  \caption[Parent to hardware and sensors.]{
    Parent class of all hardware and sensors.  %
    For brevity, methods \python{close}, \python{update} and \python{wait_until_still} have been omitted.  %
  }
  \label{aqn:fig:parent_hardware_class}
\end{figure}

\begin{figure}
  \includepython{"acquisition/driver.py"}
  \caption[TODO]{
    TODO
  }
  \label{aqn:fig:driver}
\end{figure}

\subsection{Graphical user interface}  % ----------------------------------------------------------

Made up of widgets...

Table widget...

Use of qt plots... pyqtgraph \cite{pyqtgraph}

\subsection{Scans}  % -----------------------------------------------------------------------------

The central loop of scans in PyCMDS.  %

\begin{codefragment}{python, label=aqn:lst:loop_simple}
for coordinates in list_of_coordinates:
    for hardware, coordinate in zip(hardwares, coordinates):
        hardware.set(coordinate)
    for hardware in hardwares:
        hardware.wait_until_still()
    for sensor in sensors:
        sensor.read()
    for sensor in sensors:
        sensor.wait_until_done()
\end{codefragment}

\subsection{Conditional validity}  % --------------------------------------------------------------

The central conceit of the PyCMDS modular hardware abstraction is that experiments can be boiled down to a set of orthogonal axes that can be set separately from other hardware and sensor status.  %
This requirement is loosened by the autonomic and expression systems, such that any experiment could \emph{probably} be forced into PyCMDS, but still the conceit stands---PyCMDS is probably \emph{not} the correct framework if your experiment cannot be reduced in this way.  %
From this we can see that it is useful to talk about the conditional validity of the modular hardware abstraction.
%
The important distinction is hardware complexity versus measurement complexity.

For hardware-complex problems, the challenge is coordination.  %
MR-CMDS is the perfect example of a hardware-complex problem.  %
MR-CMDS is composed of a collection (typically 5 to 10 members) of relatively simple hardware components.  %
The challenge is that experiments involve complex ``dances'', including expressions, of the component hardware.  %

For measurement-complex problems, the challenge is, well, the measurement.  %
These are experiments where every little piece of the instrument is tied together into a complex network of inseparable parts.  %
These are often time-domain or ``single shot'' measurements.  %
Consider work of (GRAPES INVENTOR).  %
Such instruments are typically much faster at data acquisition and more reliable.  %
This comes at the price of flexibility: often such instruments cannot be modified or enhanced without touching everything.  %

From an acquisition software perspective, measurement-complex problems are not amenable to general purpose modular software design.  %
The instrument is so custom that it certainly requires entirely custom software.  %

Measurements can be neither hardware-complex nor measurement-complex (simple), or both (expensive).  %
Conceptually, one can imagine a four-quadrant system.  %
Thus PyCMDS can be proud to try and generalize the hardware-complex part of acquisition software because indeed that is all that can be generalized.  %

% TODO: 4 quadrants of complexity figure

\section{Hardware}  \label{aqn:sec:hardware}  % ====================================================

Hardware are things that 1) have a position and 2) can be set to a destination.  %
Typically they also have associated units and limits.  %
Each hardware can be thought of as a dimension of the MR-CMDS experiment, and scans include a specific traversal through this multidimensional space.
%

\subsection{Hardware inheritance}  % --------------------------------------------------------------

All hardware classes are children of the parent \python{Hardware} class, which is itself subclassed from the global \python{Hardware} class shown in \autoref{aqn:fig:parent_hardware_class}.  %

\begin{figure}
  \includepython{"acquisition/hardware.py"}
  \caption[Parent hardware class.]{
    Parent class of all hardware.  %
    For brevity, methods \python{close}, \python{get_destination}, \python{get_position}, \python{is_valid}, \python{on_address_initialized}, \python{poll}, and \python{@property units} have been omitted.  %
  }
  \label{aqn:fig:hardware_class}
\end{figure}

\begin{figure}
  \includebash{"acquisition/hardware_inheritance"}
  \caption[Hardware inheritance.]{
  }
  \label{aqn:fig:hardware_inheritance}
\end{figure}

\subsection{Delays}  % ----------------------------------------------------------------------------

\subsection{Spectrometers}  % ---------------------------------------------------------------------

\subsection{OPAs}  % ------------------------------------------------------------------------------

\subsection{Filters}  % ---------------------------------------------------------------------------

\section{Sensors (devices)}  \label{aqn:sec:sensors}  % ============================================

\subsection{Sensors as axes}  % -------------------------------------------------------------------

\section{Autonomic}  % ============================================================================

% TODO: concept of additive offsets

\section{Somatic}  \label{aqn:sec:somatic}  % =======================================================

\subsection{Acquisition modules}  % ---------------------------------------------------------------

% TODO: the aqn file

% TODO: list of modules, descriptions thereof

\subsection{Queue manager}  % ---------------------------------------------------------------------

\subsection{The central loop of PyCMDS}  % --------------------------------------------------------

\subsection{The data file}  % ---------------------------------------------------------------------

\subsection{Automatic processing}  % --------------------------------------------------------------

\section{Future directions}  % ====================================================================

\subsection{Spectral delay correction module}  % --------------------------------------------------

\subsection{``Headless'' hardware, sensors}  % ----------------------------------------------------

\subsection{Ideal Axis Positions}  \label{acq:sec:ideal_axis_positions}  % -------------------------

Frequency-domain multidimensional spectroscopy is a time-intensive process.  %
A typical \gls{pixel} takes between one-half second and three seconds to acquire.  %
Depending on the exact hardware being scanned and signal being detected, this time may be mostly due to hardware motion or signal collection.  %
Due to the \gls{curse of dimensionality}, a typical three-dimensional CMDS experiment contains roughly 100,000 pixels.  %
CMDS hardware is transiently reliable, so speeding up experiments is a crucial component of unlocking ever larger dimensionalities and higher resolutions.  %

One obvious way to decrease the scan-time is to take fewer pixels.  %
Traditionally, multidimensional scans are done with linearly arranged points in each axis---this is the simplest configuration to program into the acquisition software.  %
Because signal features are often sparse or slowly varying (especially so in high-dimensional scans), linear stepping means that \emph{most of the collected pixels} are duplicates or simply noise.  %
A more intelligent choice of axis points can capture the same nonlinear spectrum in a fraction of the total pixel count.  %

An ideal distribution of pixels is linearized in \emph{signal}, not coordinate.
%
This means that every signal level (think of a contour in the N-dimensional case) has roughly the same number of pixels defining it.  %
If some generic multidimensional signal goes between 0 and 1, one would want roughly 10\% of the pixels to be between 0.9 and 1.0, 10\% between 0.8 and 0.9 and so on.  %
If the signal is sparse in the space explored (imagine a narrow two-dimensional Lorentzian in the center of a large 2D-Frequency scan) this would place the majority of the pixels near the narrow peak feature(s), with only a few of them defining the large (in axis space) low-signal floor.  %
In contrast, linear stepping would allocate the vast majority of the pixels in the low-signal 0.0 to 0.1 region, with only a few being used to capture the narrow peak feature.  %
Of course, linearizing pixels in signal requires prior expectations about the shape of the multidimensional signal---linear stepping is still an appropriate choice for low-resolution ``survey'' scans.  %

CMDS scans often possess correlated features in the multidimensional space.  %
In order to capture such features as cheaply as possible, one would want to define regions of increased pixel density along the correlated (diagonal) lineshape.  %
As a concession to reasonable simplicity, our acquisition software (PyCMDS) assumes that all scans constitute a regular array with respect to the scanned axes.  %
We can acquire arbitrary points along each axis, but not arbitrary points in the multidimensional space.  %
This means that we cannot achieve strictly ideal pixel distributions for arbitrary datasets.  %
Still, we can do much better than linear spacing.
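
The regular-array constraint can be made concrete with a short sketch.  %
The axis names and point values below are hypothetical, and the use of \python{itertools.product} is my own illustration of the idea rather than the way PyCMDS builds its coordinate list.  %

\begin{codefragment}{python}
import itertools

# hypothetical axis points: spacing along each axis may be arbitrary,
# here denser near the middle of the first axis and near zero on the second
w1_points = [1200.0, 1290.0, 1300.0, 1310.0, 1400.0]
d1_points = [0.0, 50.0, 150.0, 400.0]

# regular array: the scan visits every combination of the per-axis points,
# so the pixel count is the product of the axis lengths
list_of_coordinates = list(itertools.product(w1_points, d1_points))
pixel_count = len(list_of_coordinates)
\end{codefragment}

Extra density placed near one value of an axis is therefore applied at every value of the other axes, which is why a diagonal feature cannot be sampled strictly ideally on such a grid.  %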
% TODO: refer to PyCMDS/WrightTools 'regularity' requirement when that section exists

Almost all CMDS lineshapes (in frequency and delay) can be described using just a few lineshape functions:
\begin{ditemize}
  \item exponential
  \item Gaussian
  \item Lorentzian
  \item bimolecular
\end{ditemize}

Exponential and bimolecular dynamics fall out of simple first and second-order kinetics (I will ignore higher-order kinetics here).  %
Gaussians come from our Gaussian pulse envelopes or from normally-distributed inhomogeneous broadening.  %
The measured lineshapes are actually convolutions of the above.  %
I will ignore the convolution except for a few illustrative special cases.  %
More exotic lineshapes are possible in CMDS---quantum beating and breathing modes, for example---I will also ignore these.  %
Derivations of the ideal pixel positions for each of these lineshapes appear below.
% TODO: cite Wright Group quantum beating paper, Kambhampati breathing paper

\subsubsection{Exponential}

Simple exponential decays are typically used to describe population and coherence-level dynamics in CMDS.  %
For some generic exponential signal $S$ with time constant $\tau$,
\begin{equation} \label{eq:simple_exponential_decay}
  S(t) = \me^{-\frac{t}{\tau}}.
\end{equation}
We can invert \autoref{eq:simple_exponential_decay}, asking ``what $t$ do I need to get a certain signal level?'':
\begin{eqnarray}
  \log{(S)} &=& -\frac{t}{\tau} \\
  t &=& -\tau\log{(S)}.
\end{eqnarray}
So to step linearly in $S$, my positions in $t$ have to go as $-\tau\log{(S)}$.

We want to go linearly in signal, meaning that we want to divide $S$ into even sections.  %
If $S$ goes from 0 to 1 and we choose to acquire $N$ points,
\begin{eqnarray}
  t_n &=& -\tau\log{\left(\frac{n}{N}\right)}.
\end{eqnarray}
Note that $t_n$ starts at long times and approaches zero delay.  %
So the first $t_1$ is the smallest signal and $t_N$ is the largest.
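
As a quick numerical check of the expression for $t_n$, the following sketch computes the positions and confirms that the resulting signal levels are evenly spaced.  %
The function names and the choices $\tau = 1$ and $N = 5$ are arbitrary and hypothetical; nothing here is taken from the PyCMDS source.  %

\begin{codefragment}{python}
import math


def exponential_positions(tau, n_points):
    """Positions t_n = -tau * log(n / N) for n = 1 ... N."""
    return [-tau * math.log(n / n_points) for n in range(1, n_points + 1)]


def exponential_signal(t, tau):
    """Normalized exponential decay S(t) = exp(-t / tau)."""
    return math.exp(-t / tau)


tau = 1.0  # arbitrary time constant, in arbitrary delay units
positions = exponential_positions(tau, n_points=5)
signals = [exponential_signal(t, tau) for t in positions]

# positions run from long delay toward zero delay,
# and the signal at each position is n / N
steps = [b - a for a, b in zip(signals, signals[1:])]
\end{codefragment}

Every entry of \python{steps} equals $1/N$ to within floating-point error, so each signal interval receives exactly one pixel.  %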
%

Now we can start to consider realistic cases, like where $\tau$ is not quite known and where some other longer dynamics persist (manifested as a static offset).  %
Since these values are not separable in a general system, I'll keep $S$ normalized between 0 and 1.  %
\begin{eqnarray}
  S &=& (1-c)\me^{-\frac{t}{\tau_{\mathrm{actual}}}} + c \\
  S_n &=& (1-c)\me^{-\frac{-\tau_{\mathrm{step}}\log{\left(\frac{n}{N}\right)}}{\tau_{\mathrm{actual}}}} + c \\
  S_n &=& (1-c)\me^{-\frac{\tau_{\mathrm{step}}}{\tau_{\mathrm{actual}}} \log{\left(\frac{N}{n}\right)}} + c \\
  S_n &=& (1-c)\left(\frac{N}{n}\right)^{-\frac{\tau_{\mathrm{step}}}{\tau_{\mathrm{actual}}}} + c \\
  S_n &=& (1-c)\left(\frac{n}{N}\right)^{\frac{\tau_{\mathrm{step}}}{\tau_{\mathrm{actual}}}} + c
\end{eqnarray}

\begin{figure}
  \includegraphics[scale=0.5]{"processing/PyCMDS/ideal axis positions/exponential"}
  \caption[TODO]{TODO}
  \label{aqn:fig:exponential_steps}
\end{figure}

\subsubsection{Gaussian}

\subsubsection{Lorentzian}

\subsubsection{Bimolecular}

\subsection{Simultaneous acquisitions}  % ---------------------------------------------------------

\subsection{Enhanced modularity}  % ---------------------------------------------------------------

\subsection{wt5 savefile}  % ----------------------------------------------------------------------

\subsection{Hotswappable hardware}  % -------------------------------------------------------------

\subsection{Better logging and error handling}  % -------------------------------------------------