From d29370edbb0eeb56ec3aabff437c048b7d9ee178 Mon Sep 17 00:00:00 2001
From: Blaise Thompson
Date: Mon, 26 Mar 2018 17:49:30 -0500
Subject: 2018-03-26 17:49

---
 processing/chapter.tex | 284 ++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 245 insertions(+), 39 deletions(-)

(limited to 'processing/chapter.tex')

diff --git a/processing/chapter.tex b/processing/chapter.tex
index 94ef12e..4f8f029 100644
--- a/processing/chapter.tex
+++ b/processing/chapter.tex
@@ -378,38 +378,129 @@ operations, as I will describe below. %
Currently the artists sub-package is built on top of the wonderful matplotlib library. %
In the future, other libraries (e.g. Mayavi \cite{Mayavi}) may be incorporated. %

-\subsection{Colormaps} % -------------------------------------------------------------------------
+\subsection{Strategies for 2D visualization} % ---------------------------------------------------

-% TODO: figure like made by visualize_colormap_components
+Representing two-dimensional data is an important capability for WrightTools, so some special discussion about how such representations work is warranted. %
+WrightTools data is typically highly structured, with values recorded at a grid of positions. %
+To represent two-dimensional data, then, WrightTools needs to map the values onto a color axis. %
+There are better and worse choices of colormap for this task, as discussed below. %
+
+\subsubsection{Colormap}
+
+\begin{figure}
+  \includegraphics[scale=0.5]{"processing/wright_cmap"}
+  \includegraphics[scale=0.5]{"processing/cubehelix_cmap"}
+  \includegraphics[scale=0.5]{"processing/viridis_cmap"}
+  \includegraphics[scale=0.5]{"processing/default_cmap"}
+  \caption[CAPTION TODO]{
+    CAPTION TODO}
+  \label{pro:fig:cmaps}
+\end{figure}
+
+\begin{figure}
+  \includegraphics[width=\textwidth]{"processing/cmap_comparison"}
+  \caption[CAPTION TODO]{
+    CAPTION TODO}
+  \label{pro:fig:cmap_comparison}
+\end{figure}
+
+\autoref{pro:fig:cmaps} shows the red, green, and blue components of four different colormaps. %
+The black line is the net intensity of each color (a larger value means a lighter color). %
+Below each figure is a gray-scale representation of the corresponding colormap. %
+The r, g, and b components are scaled according to human perception. % TODO: values, from where
+The traditional Wright Group colormap (derived from jet) is shown first. % TODO: cite jet
+It is not perceptual: its perceived lightness does not change monotonically with the underlying value, so the colormap itself can introduce apparent features. %
+Following are two perceptual colormaps: cubehelix, from Green, % TODO: cite
+and viridis, the new matplotlib default. % TODO: cite
+WrightTools uses the algorithm from Green to define a custom cubehelix colormap with good perceptual properties and familiar Wright Group coloration. %

% TODO: figure like one on wall
% TODO: mention isoluminant

-\subsection{Interpolation} % ---------------------------------------------------------------------
+\subsubsection{Interpolation type}
+
+WrightTools data is defined at discrete points, but an entire 2D surface must be constructed in order to draw a fully colored plot. %
+Defining this surface requires \emph{interpolation}, and the various strategies have different advantages and disadvantages. %
+Choosing the wrong type of interpolation can be misleading. %
+
+In the multidimensional spectroscopy community, the most popular form of interpolation is based on Delaunay triangulation. %
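+To make the tradeoff concrete, the following sketch (illustrative only, not the WrightTools implementation) places the same scattered measurements onto a regular grid in two ways: nearest-neighbor interpolation, which never invents intermediate values, and Delaunay-based linear interpolation as provided by \python{scipy.interpolate.griddata}. %
+\begin{codefragment}{python}
+# illustrative sketch --- not WrightTools internals
+import numpy as np
+from scipy.interpolate import griddata
+
+# scattered (x, y) measurement coordinates and a synthetic signal
+rng = np.random.default_rng(0)
+points = rng.uniform(-1, 1, size=(250, 2))
+values = np.exp(-(points ** 2).sum(axis=1) / 0.1)
+
+# regular grid to interpolate onto
+xi, yi = np.meshgrid(np.linspace(-1, 1, 101), np.linspace(-1, 1, 101))
+
+# 'nearest' assigns each grid point the value of its closest measurement;
+# 'linear' triangulates the measurements (Delaunay) and interpolates within each triangle
+nearest = griddata(points, values, (xi, yi), method="nearest")
+linear = griddata(points, values, (xi, yi), method="linear")
+\end{codefragment}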
-% TODO: fill types figure from wright.tools
+\begin{figure}
+  \includegraphics[width=\textwidth]{"processing/fill_types"}
+  \caption[CAPTION TODO]{
+    CAPTION TODO}
+  \label{pro:fig:fill_types}
+\end{figure}

\subsection{Quick} % -----------------------------------------------------------------------------

-\subsubsection{1D}
+To facilitate easy visualization of data, WrightTools offers ``quick'' artist functions which quickly generate 1D or 2D representations. %
+These functions are made to produce good representations by default, but they have certain keyword arguments to make popular customizations easy. %
+These functions are particularly useful in the context of REPLs and auto-generated plots in acquisition software. %
+
+Default outputs of \python{wt.artists.quick1D} and \python{wt.artists.quick2D} are shown in \autoref{pro:fig:quick1D} and \autoref{pro:fig:quick2D}, respectively. %
+The full script used to create each image is included in the figures. %
+Note that the actual quick functions are each one-liners, and that the supplied keyword arguments are necessary only because the images are being saved (not typical for users in interactive mode). %
+
+Perhaps the most powerful feature of \python{quick1D} and \python{quick2D} is their ability to treat higher-dimensional datasets by automatically generating multiple figures. %
+When handing a dataset of higher dimensionality to these artists, the user may choose which axes will be plotted against using keyword arguments. %
+Any axis not plotted against will be iterated over, such that an image is generated at each coordinate along that axis. %
+Users may also provide a dictionary with entries of the form \python{{axis_name: [position, units]}} to choose a single coordinate along non-plotted axes. %
+These functionalities are derived from \python{wt.Data.chop}, discussed further below. % TODO: link

\begin{figure}
  \includegraphics[width=0.5\textwidth]{"processing/quick1D 000"}
  \includepython{"processing/quick1D.py"}
-  \caption[CAPTION TODO]
-    {CAPTION TODO}
+  \caption[CAPTION TODO]{
+    CAPTION TODO}
+  \label{pro:fig:quick1D}
\end{figure}

-\subsubsection{2D}
-
\begin{figure}
  \includegraphics[width=0.5\textwidth]{"processing/quick2D 000"}
  \includepython{"processing/quick2D.py"}
-  \caption[CAPTION TODO]
-    {CAPTION TODO}
+  \caption[CAPTION TODO]{
+    CAPTION TODO}
+  \label{pro:fig:quick2D}
+\end{figure}
+
+% TODO: signed data (with and without dynamic_range=True)
+
+\subsection{Specialty} % ------------------------------------------------------------------------
+
+\subsection{API} % -------------------------------------------------------------------------------
+
+The artists sub-package offers a thin wrapper on the default matplotlib object-oriented figure creation API. %
+The wrapper allows WrightTools to add the following capabilities on top of matplotlib:
+\begin{ditemize}
+  \item More consistent multi-axes figure layout.
+  \item Ability to plot data objects directly.
+\end{ditemize}
+Each of these is meant to lower the barrier to plotting data. %
+Without going into every detail of matplotlib's figure generation capabilities, this section introduces the unique strategy that the WrightTools wrapper takes. %
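+
+As a hedged sketch of what this looks like in practice (the function and keyword names below follow the patterns in the current WrightTools documentation and should be read as illustrative assumptions rather than a definitive recipe), a small synthetic data object is built, then plotted on a single-axis figure with an attached colorbar. %
+\begin{codefragment}{python}
+# sketch of the wrapped figure-creation workflow --- exact names are assumptions
+import numpy as np
+import matplotlib.pyplot as plt
+import WrightTools as wt
+
+# build a small synthetic 2D data object
+w1 = np.linspace(-2, 2, 51)[:, None]
+w2 = np.linspace(-2, 2, 51)[None, :]
+data = wt.Data(name="example")
+data.create_variable("w1", values=w1, units="eV")
+data.create_variable("w2", values=w2, units="eV")
+data.create_channel("signal", values=np.exp(-w1 ** 2 - w2 ** 2))
+data.transform("w1", "w2")
+
+# create_figure handles consistent multi-axes layout, including a colorbar column
+fig, gs = wt.artists.create_figure(width="single", cols=[1, "cbar"])
+ax = plt.subplot(gs[0, 0])
+ax.pcolor(data, channel="signal")  # data objects are handed to the plotting method directly
+cax = plt.subplot(gs[0, 1])
+wt.artists.plot_colorbar(cax=cax, label="signal")
+\end{codefragment}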
+
+\subsection{Gotchas} % ---------------------------------------------------------------------------
+
+% TODO: mention gotcha of apparently narrowing linewidths with wigners (how to READ colormaps)
+
\section{Variables and channels} % ===============================================================

Data objects are made up of many component channels and variables, each array having the same
@@ -454,10 +545,8 @@ From a quick inspection, one can see that \python{w1} and \python{wm} were scann
\python{w3}, \python{d0}, and \python{d1} were not moved at all, yet their coordinates are still propagated. %
-
\section{Axes} % =================================================================================
-
The axes have the joint shape of their component variables. %
Although not shown in this example, channels also may have axes with length 1.
@@ -472,14 +561,15 @@ squeezed and broadcasted array, respectively. %
    CAPTION TODO}
\end{figure}
-
\section{Math} % =================================================================================

Now that we know the basics of how the WrightTools \python{Data} class stores data, it's time to do some data manipulation. %
Let's start with some elementary algebra. %

-\subsection{In-place operators}
+% TODO: mention chunkwise strategy
+
+\subsection{In-place operators} % ----------------------------------------------------------------

In Python, operators are symbols that carry out some computation. %
Consider the following:
@@ -529,7 +619,7 @@ data.created at /tmp/tdyvfxu8.wt5::/
  range: 2500.0 to 700.0 (nm)
  size: 1801
>>> data.signal
-
+
>>> data.signal.min(), data.signal.max()
(0.10755, 1.58144)
>>> data.signal /= 2
@@ -538,17 +628,83 @@ data.created at /tmp/tdyvfxu8.wt5::/
\end{codefragment}
Variables also support in-place operators. %

-\subsection{Clip}
+\subsection{Clip} % ------------------------------------------------------------------------------
+
+Clip allows users to exclude values outside of a certain range. %
+This can be particularly useful in cases like fitting. %
+See section ... for an example. % TODO: link to section
+
+It is also useful when noise in a certain region of a spectrum obscures the useful data, which is particularly true for normalized and signed data. %
+
+\subsection{Symmetric root} % --------------------------------------------------------------------
+
+Homodyne- and heterodyne-detected data need to be scaled appropriately before they can be compared. %
+Much of the data that we collect in the Wright Group is homodyne detected, so it goes as $N^2$. %
+To compare with the majority of other experiments, including basic linear experiments like absorption and Raman spectroscopy, one needs to plot on ``amplitude level'', that is $\mathsf{amplitude=\sqrt{signal}}$. %
+
+Due to things like leveling, chopping, baseline subtraction, and simple noise, even homodyne-detected data typically include negative numbers. %
+Symmetric root treats these values as cleanly as possible by applying the same relative scaling to positive and negative values, and keeping the sign of each pixel, as the following simplified code shows. %
+\begin{codefragment}{python}
+import numpy as np
+
+def symmetric_root(value):
+    return np.sign(value) * np.sqrt(np.abs(value))
+\end{codefragment}
+
+For generality, \python{wt.Channel.symmetric_root} accepts any root as an argument. %
+The default is 2, for the common case of going from intensity scaling to amplitude scaling. %
-% TODO
+Any other power can be applied to a channel using the in-place \python{**=} syntax. %
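+
+Continuing with the \python{data} object from the fragment above, these channel methods compose naturally with the in-place operators (a brief sketch; the specific values and keyword names follow the current WrightTools documentation and should be treated as assumptions). %
+\begin{codefragment}{python}
+>>> data.signal.clip(min=0.1, max=1.0)  # exclude values outside an illustrative trusted range
+>>> data.signal.symmetric_root(2)       # intensity level -> amplitude level
+>>> data.signal **= 2                   # and back to intensity level
+\end{codefragment}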
-\subsection{Symmetric root}
+\subsection{Log} % -------------------------------------------------------------------------------
-% TODO
+The method \python{wt.Channel.log} applies logarithmic scaling to a channel. %
+The base of the logarithm is settable by keyword argument, with a default of $\me$. %
+There are also methods \python{wt.Channel.log10} and \python{wt.Channel.log2}, which accept no keyword arguments. %
+These may be slightly faster than \python{channel.log(base=10)} and \python{channel.log(base=2)}. %
-\subsection{Log}
+\subsection{Level} % -----------------------------------------------------------------------------
-% TODO
+% TODO: figure from wright.tools
+
+\subsection{Trim} % ------------------------------------------------------------------------------
+
+Trim uses a statistical treatment to find and remove outliers from a dataset. %
+It is useful in cases where the naive strategy employed by \python{wt.Channel.clip} is not sufficient, and when preparing for fitting. %
+
+Currently \python{trim} only supports one statistical treatment: the z-test. %
+Z-testing compares each pixel to its multidimensional neighborhood of pixels. %
+If the pixel is more than $n$ standard deviations outside of the neighborhood mean (using the neighborhood standard deviation), it is either masked, replaced with \python{np.nan}, or replaced with the neighborhood mean. %
+All outliers are found before any outliers are modified, so the algorithm is not directional. %
+
+% TODO: z-test citation
+
+\python{wt.Channel.trim} can easily be enhanced with other statistical methods as needed. %
+
+\subsection{Smooth} % ----------------------------------------------------------------------------
+
+\python{wt.Channel.smooth} essentially passes the channel through a low-pass filter. %
+It does this by convolving the channel with an n-dimensional Kaiser--Bessel window. %
+
+% TODO: define Kaiser window
+% TODO: citations
+% TODO: motivate use of Kaiser window over other choices
+
+Smoothing is a highly destructive process, and it can be very dangerous if used unthinkingly. %
+However, it can be useful when noisy data is collected at high resolution. %
+By taking many more pixels than required to capture the relevant spectral or temporal features, one can confidently smooth the collected data in post-processing to achieve clean results. %
+This strategy is similar to that used in time-domain CMDS, where a low-pass filter is applied to the very high resolution raw data. %

\section{Dimensionality manipulation} % ==========================================================

@@ -560,7 +716,7 @@ Also consider using the fit sub-package. % TODO: more info, link to section
Chop is one of the most important methods of data, although it is typically not called directly by users of WrightTools. %
Chop takes n-dimensional data and ``chops'' it into all of its lower dimensional components. %
-Consider a 3D dataset in \python{('wm', 'w2', 'w1''''')}. %
+Consider a 3D dataset in \python{('wm', 'w2', 'w1')}. %
This dataset can be chopped to its component 2D \python{('wm', 'w1')} spectra. %
\begin{codefragment}{python, label=test_label}
>>> import WrightTools as wt; from WrightTools import datasets
@@ -607,33 +763,81 @@ This same syntax used in artists... % TODO

\subsection{Collapse} % --------------------------------------------------------------------------

+\python{wt.Data.collapse} reduces the dimensionality of the data object by exactly 1 using some mathematical operation. %
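+Conceptually, a collapse is just a reduction of the underlying value array along one axis, roughly as in the following sketch (plain numpy for illustration, not the WrightTools implementation). %
+\begin{codefragment}{python}
+# what collapsing a ('w1', 'w2', 'd2') movie along 'd2' amounts to, conceptually
+import numpy as np
+
+movie = np.random.random((51, 51, 25))    # values on a ('w1', 'w2', 'd2') grid
+d2 = np.linspace(-1, 10, 25)              # coordinates along the collapsed axis
+
+integrated = np.trapz(movie, d2, axis=2)  # 'integrate' method
+averaged = movie.mean(axis=2)             # 'average' method
+integrated.shape, averaged.shape          # both (51, 51): dimensionality reduced by exactly one
+\end{codefragment}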
+Currently supported methods are integrate, average, sum, max, and min, with integrate as the default. %
+Collapsing a dataset is a very simple and powerful method of dimensionality reduction. %
+It allows users to inspect the net dependence along a set of axes without being opinionated about the coordinate in the other dimensions. %
+It can also be used as a method of noise reduction. %
+
+\subsection{Split} % -----------------------------------------------------------------------------
+
+\python{wt.Data.split} is not a proper method of dimensionality reduction, but it is a crucial tool for interacting with the dimensionality of a data object. %
+\python{split} allows users to access a portion of the dataset. %
+The most common use case is certainly in fitting operations. %
+In population spectroscopies like transient absorption and transient grating, it has become typical to take three-dimensional ``movies'' in \python{('w1', 'w2', 'd2')}, where \python{w1} is a probe, \python{w2} is a pump, and \python{d2} is a population delay. %
+It can be informative to fit each \python{d2} trace to a model (often a single exponential), but such a fit will not do well at describing the signal through zero delay and for positive \python{d2} values (into the coherence pathways). %
+\python{data.split(d2=0.)} will return two data objects, one for the positive delays and one for the negative delays. %
+You can then pass the data object containing only population response into your fitting routine. %
+
+\subsection{Join} % ------------------------------------------------------------------------------
-\section{Specialty visualizations} % =============================================================
+Like \python{split}, \python{wt.data.join} is not a method of dimensionality reduction. %
-\subsection{Specialty} % -------------------------------------------------------------------------
+It is also not a method of the \python{Data} class; it is a bare function. %
+Join accepts multiple data objects and attempts to join them together. %
+To do this, the variable and channel names must agree. %
-\subsection{Artists API} % -----------------------------------------------------------------------
+% TODO: join example
-The artists sub-package offers a thin wrapper on the default matplotlib object-oriented figure
-creation API. %
+\section{Fitting} % ==============================================================================
-The wrapper allows WrightTools to add the following capabilities on top of matplotlib:
-\begin{ditemize}
-  \item More consistent multi-axes figure layout.
-  \item Ability to plot data objects directly.
-\end{ditemize}
-Each of these is meant to lower the barrier to plotting data. %
-Without going into every detail of matplotlib figure generation capabilities, this section
-introduces the unique strategy that the WrightTools wrapper takes. %
+Like the rest of WrightTools, the \python{fit} sub-package is made to play as nicely as possible with high-dimensional data. %
+WrightTools uses fitting as a method of dimensionality reduction. %
+For example, consider a three-dimensional \python{('w1', 'w2', 'd2')} ``movie'', where \python{d2} is a population delay that can be well approximated by a single exponential decay with offset. %
+Rather than attempt to visualize \python{w1, w2} at some specific value of \python{d2}, it can be powerful to instead consider the parameters (amplitude, offset, and time constant) of an exponential fit at each \python{w1, w2} coordinate. %
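+The sketch below shows the basic idea with plain numpy and scipy on a synthetic movie; it is illustrative only and does not use the \python{wt.fit} machinery itself. %
+\begin{codefragment}{python}
+# slice-by-slice exponential fitting as dimensionality reduction (illustrative)
+import numpy as np
+from scipy.optimize import curve_fit
+
+def decay(t, amplitude, tau, offset):
+    return amplitude * np.exp(-t / tau) + offset
+
+# synthetic ('w1', 'w2', 'd2') movie: a single exponential decay plus noise
+d2 = np.linspace(0, 10, 51)
+movie = decay(d2, 1.0, 2.5, 0.1) + 0.02 * np.random.randn(32, 32, d2.size)
+
+params = np.empty((32, 32, 3))  # amplitude, tau, offset at each (w1, w2) coordinate
+for i in range(32):
+    for j in range(32):
+        trace = movie[i, j]
+        guess = (trace.max() - trace.min(), d2.mean(), trace.min())  # crude automated guess
+        params[i, j], _ = curve_fit(decay, d2, trace, p0=guess)
+\end{codefragment}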
+On a more practical note, this kind of slice-by-slice dimensionality reduction via fitting can greatly simplify automated instrumental calibration (see ...). % TODO: link to opa chapter
+WrightTools employs some simple tricks to enable these kinds of fit operations, described here. %
-% TODO: finish discussion
+% TODO: consider inserting figures that demonstrate this story (need to use wt2?)
-\section{Fitting} % ==============================================================================
+\subsection{Function objects} % ------------------------------------------------------------------
+
+One challenge of slice-by-slice fitting is making a good initial guess to optimize from. %
+It is not tractable to ask the user to provide a guess for each slice, so some kind of reasonable automated guessing must be used. %
+WrightTools ``function'' objects are self-contained descriptions of a particular function. %
+As an example, consider the \python{wt.fit.Exponential} class. %
+It knows its own parameters, it can evaluate itself at a given set of parameter values, it can guess reasonable initial parameters from a slice, and it can fit a slice starting from that guess. %
+
+Function objects can also be used directly, outside of any slice-by-slice fitting loop. %
+
+\subsection{Fitter} % ----------------------------------------------------------------------------
+
+The \python{Fitter} object loops through each slice of the data, applying a function object to every one. %
+It returns a model data object and an \python{outs} data object containing the fit outputs. %

\section{Construction and maintenance} % =========================================================

+\subsection{Collaborative development} % ---------------------------------------------------------
+
+\subsection{Version control} % -------------------------------------------------------------------
+
\subsection{Unit tests} % ------------------------------------------------------------------------

\section{Distribution and licensing} \label{pro:sec:disbribution} % ==============================

@@ -642,4 +846,6 @@ WrightTools is MIT licensed. %
WrightTools is distributed on PyPI and conda-forge.

-\section{Future directions} % ====================================================================
\ No newline at end of file
+\section{Future directions} % ====================================================================
+
+Single variable decomposition.
\ No newline at end of file
--
cgit v1.2.3