2018-03-31 12:08

author: Blaise Thompson <blaise@untzag.com> 2018-03-31 12:08:35 -0500
committer: Blaise Thompson <blaise@untzag.com> 2018-03-31 12:08:35 -0500
commit: e40e84ad9b891c96ffe7cda884087c0b9dc098c7 (patch)
tree: d3b1d472553e26d5a4b25834fdb93c9e4fda7689 /processing/chapter.tex
parent: 548cc56e7b65184d1e10a26711837e18f189c136 (diff)
1 files changed, 135 insertions, 86 deletions
diff --git a/processing/chapter.tex b/processing/chapter.tex
index db8da3b..81886c2 100644
--- a/processing/chapter.tex
+++ b/processing/chapter.tex
@@ -23,19 +23,49 @@
 
 \clearpage
 
-From a data science perspective, CMDS has several unique challenges:
+CMDS takes a somewhat unique approach to instrumental science.  %
+There are not that many well-defined, well-trodden experimental paths.  %
+The basic ideas stay the same, but the real power is in the creativity and flexibility to tweak the
+experiment according to the particular question being asked.  %
+How, then, could one go about making a data processing software package for CMDS?  %
+The package has to be flexible enough to accommodate the diversity of experiments, but still solid
+enough to be a foundational tool.  %
+
+When creating a toolkit for CMDS, there are several challenges worth considering:
 \begin{ditemize}
   \item Dimensionality of datasets can typically be greater than two, complicating
     \textbf{representation}.
-  \item Shape and dimensionality change...
-  \item Data can be large (over one million points).  % TODO: contextualize large (not BIG DATA)
+  \item Shape and dimensionality change, and relevant axes can be different from the scanned
+    dimensions.  %
+  \item Data can be awkwardly large-ish (several million pixels), and can become legitimately large
+    in numerical simulations.  %
+  \item There are no agreed-upon file formats for CMDS dataset storage.  %
 \end{ditemize}
-I have designed a software package that directly addresses these issues.  %
-
-WrightTools is a software package at the heart of all work in the Wright Group.  %
-
-% TODO: more intro
+The biggest challenge is to find a really good definition for what constitutes a CMDS dataset.  %
+Once understood, this common denominator can be enshrined into software and built upon.  %
+
+WrightTools is a software package written in Python, built using the excellent tools provided by
+the scientific Python collection of packages, especially SciPy and NumPy.  % TODO: cite cite cite
+WrightTools defines a universal file-format that is flexible enough to encompass the diversity of
+CMDS while still being entirely self-describing.  %
+This file format is based on the popular binary format ``HDF5''.  % TODO: cite
+This format allows for computers to interact with the arrays piece-by-piece in a very fast and
+reliable way, without loading the entire array in and out of memory.  %
+WrightTools piggybacks on this, allowing users to interact with legitimately large CMDS datasets
+without worrying about memory overflow.  %
+WrightTools takes a unique approach to representing CMDS data in array format, nick-named
+``semi-structure'', that allows for greater flexibility in representing CMDS in different
+coordinate spaces.  %
+
+WrightTools is written to be used in scripts and in the command line.  %
+It does not have any graphical components built in, except for the ability to generate plots using
+matplotlib.  % TODO: cite
+Being built in this way gives WrightTools users maximum flexibility, and allows for rapid
+collaborative development.  %
+It also allows other software packages to use WrightTools as a ``back-end'' foundational software,
+as has already been done in simulation and acquisition software created in the Wright Group.  %
 
+\clearpage
 \section{Introduction to WrightTools}  % ==========================================================
 
 WrightTools is written in Python, and endeavors to have a ``pythonic'', explicit and ``natural''
@@ -47,7 +77,7 @@ To use WrightTools, simply import:
 3.0.0
 \end{codefragment}
 I'll discuss more about how exactly WrightTools packaging, distribution, and instillation works in
-\autoref{sec:processing_distbribution}.
+\autoref{pro:sec:processing_distribution}.
 
 We can use the builtin Python function \python{dir} to interrogate the contents of the
 WrightTools package.  %
@@ -117,75 +147,86 @@ spectrum, and channels are each a particular kind of signal within that spectrum
 Typical variables might be \python{[w1, w2, w3, d1, d2]}, and typical channels
 \python{[pmt, pyro1, pyro2, pyro3]}.  %
 
-As an overview, the following lexicographically lists the attributes and methods of
-\python{Data}.  %
-\begin{ditemize}
-  \item method \python{collapse}: Collapse along one dimension in a well-defined way.
-  \item method \python{convert}: Convert all axes of a certain kind.
-  \item method \python{create_channel}: Create a new channel.
-  \item method \python{create_variable}: Create a new variable.
-  \item method \python{fullpath}
-  \item method \python{get_nadir}
-  \item method \python{get_zenith}
-  \item method \python{heal}
-  \item attribute \python{kind}
-  \item method \python{level}
-  \item method \python{map_variable}
-  \item attribute \python{natural_name}
-  \item attribute \python{ndim}
-  \item method \python{offset}
-  \item method \python{print_tree}
-  \item method \python{remove_channel}
-  \item method \python{remove_variable}
-  \item method \python{rename_channels}
-  \item method \python{rename_variables}
-  \item attribute \python{shape}
-  \item method \python{share_nans}
-  \item attribute \python{size}
-  \item method \python{smooth}
-  \item attribute \python{source}
-  \item method \python{split}
-  \item method \python{transform}
-  \item attribute \python{units}
-  \item attribute \python{variable_names}
-  \item attribute \python{variables}
-  \item method \python{zoom}
-\end{ditemize}
+\begin{table}
+  \begin{tabular}{c | c | l}
+    & type & description \\ \hline
+    \python{collapse} & method & Collapse along one dimension in a well-defined way. \\ \hline
+    \python{convert} & method & Convert all axes of a certain kind. \\ \hline
+    \python{create_channel} & method & Create a new channel. \\ \hline
+    \python{create_variable} & method & Create a new variable. \\ \hline
+    \python{fullpath} & attribute & \\ \hline
+    \python{get_nadir} & & \\ \hline
+    \python{get_zenith} & & \\ \hline
+    \python{heal} & & \\ \hline
+    \python{kind} & & \\ \hline
+    \python{level} & & \\ \hline
+    \python{map_variable} & & \\ \hline
+    \python{natural_name} & & \\ \hline
+    \python{ndim} & & \\ \hline
+    \python{offset} & & \\ \hline
+    \python{print_tree} & & \\ \hline
+    \python{remove_channel} & & \\ \hline
+    \python{remove_variable} & & \\ \hline
+    \python{rename_channels} & & \\ \hline
+    \python{shape} & & \\ \hline
+    \python{share_nans} & & \\ \hline
+    \python{size} & & \\ \hline
+    \python{smooth} & & \\ \hline
+    \python{source} & & \\ \hline
+    \python{split} & & \\ \hline
+    \python{transform} & & \\ \hline
+    \python{units} & & \\ \hline
+    \python{variable_names} & & \\ \hline
+    \python{variables} & & \\ \hline
+    \python{zoom} & & \\ \hline
+  \end{tabular}
+  \caption[Attributes and methods of Data]{
+    Key attributes and methods of data, lexicographically listed
+  }
+\end{table}
 
 Each data object contains instances of \python{Channel} and \python{Variable} which represent the
 principle multidimensional arrays.  %
 The following lexicographically lists the attributes of these instances.  %
 Certain methods and attributes are unique to only one type of dataset, and are marked as such.  %
-\begin{ditemize}
-  \item method \python{argmax}
-  \item method \python{argmin}
-  \item method \python{chunkwise}
-  \item method \python{clip}
-  \item method \python{convert}
-  \item attribute \python{full}
-  \item attribute \python{fullpath}
-  \item attribute \python{label} (variable only)
-  \item method \python{log}
-  \item method \python{log10}
-  \item method \python{log2}
-  \item method \python{mag}
-  \item attribute \python{major_extent} (channel only)
-  \item method \python{max}
-  \item method \python{min}
-  \item attribute \python{minor_extent} (channel only)
-  \item attribute \python{natural_name}
-  \item method \python{normalize} (channel only)
-  \item attribute \python{null} (channel only)
-  \item attribute \python{parent}
-  \item attribute \python{points}
-  \item attribute \python{signed} (channel only)
-  \item method \python{slices}
-  \item method \python{symmetric_root}
-  \item method \python{trim} (channel only)
-\end{ditemize}
+
 Channels and variables also support direct indexing / slicing using \python{__getitem__}, as
 discussed more in...  % TODO: where is it discussed more?
- 
+
+\begin{table}
+  \begin{tabular}{c | c | c | l}
+    & type & of & description \\ \hline
+    \python{argmax} & method & both & \\ \hline
+    \python{argmin} & & & \\ \hline
+    \python{chunkwise} & & & \\ \hline
+    \python{clip} & & & \\ \hline
+    \python{convert} & & & \\ \hline
+    \python{full} & & & \\ \hline
+    \python{fullpath} & & & \\ \hline
+    \python{label} & attribute & variable & \\ \hline
+    \python{log} & & & \\ \hline
+    \python{log10} & & & \\ \hline
+    \python{log2} & & & \\ \hline
+    \python{mag} & & & \\ \hline
+    \python{major_extent} & attribute & channel & \\ \hline
+    \python{max} & & & \\ \hline
+    \python{min} & & & \\ \hline
+    \python{minor_extent} & attribute & channel & \\ \hline
+    \python{natural_name} & & & \\ \hline
+    \python{normalize} & & channel & \\ \hline
+    \python{null} & & channel & \\ \hline
+    \python{parent} & & & \\ \hline
+    \python{points} & & & \\ \hline
+    \python{signed} & & channel & \\ \hline
+    \python{slices} & & & \\ \hline
+    \python{symmetric_root} & & & \\ \hline
+    \python{trim} & & channel & \\ \hline
+  \end{tabular}
+  \caption[Attributes and methods of Channel and Variable.]{
+    Key attributes and methods of channel and variable, lexicographically listed
+  }
+\end{table}
+
 Axes are ways to organize data as functional of particular variables (and combinations thereof).  %
 The \python{Axis} class does not directly contain the respective arrays---it merely refers to the
 associated variables.  %
@@ -199,20 +240,28 @@ include \python{'w1'}, \python{'w1=wm'}, \python{'w1+w2'}, \python{'2*w1'}, \pyt
 Axes can be directly indexed / sliced into using \python{__getitem__}, and they support many of the
 ``numpy-like'' attributes.  %
 A lexicographical list of axis attributes and methods follows.
-\begin{ditemize}
-  \item attribute \python{full}
-  \item attribute \python{label}
-  \item attribute \python{natural_name}
-  \item attribute \python{ndim}
-  \item attribute \python{points}
-  \item attribute \python{shape}
-  \item attribute \python{size}
-  \item attribute \python{units_kind}
-  \item attribute \python{variables}
-  \item method \python{convert}
-  \item method \python{min}
-  \item method \python{max}
-\end{ditemize}  % TODO: actually lexicographical
+
+
+\begin{table}
+  \begin{tabular}{c | c | l}
+    & type & description \\ \hline
+    \python{full} & & \\ \hline
+    \python{label} & & \\ \hline
+    \python{natural_name} & & \\ \hline
+    \python{ndim} & & \\ \hline
+    \python{points} & & \\ \hline
+    \python{shape} & & \\ \hline
+    \python{size} & & \\ \hline
+    \python{units_kind} & & \\ \hline
+    \python{variables} & & \\ \hline
+    \python{convert} & & \\ \hline
+    \python{min} & & \\ \hline
+    \python{max} & & \\ \hline
+  \end{tabular}
+  \caption[Attributes and methods of Axis.]{
+    Key attributes and methods of axis, lexicographically listed
+  }
+\end{table}
 
 \section{Creating a data object}  % ===============================================================
 
@@ -309,7 +358,7 @@ Conceptually, it behaves like a folder in a traditional file-system.  %
 
 The primary attributes and methods of \python{Collection} are
 \begin{ditemize}
-  \item attribute item_names
+  \item attribute \python{item_names}
   \item attribute \python{fullpath}
 \end{ditemize}
 % TODO: finish adding attributes and methodsd