From ac6bde61f90e6684b5b5b79286ebe58d08c09f9c Mon Sep 17 00:00:00 2001
From: Blaise Thompson <blaise@untzag.com>
Date: Mon, 19 Mar 2018 17:11:06 -0500
Subject: 2018-03-19 17:11

---
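Note on the axis expressions added to processing/chapter.tex: the chapter describes them as strings built from numbers and variable natural names. The sketch below is only an illustration of that idea, not the WrightTools implementation. The function name `evaluate_axis`, the use of plain lists for variables, and the restriction to arithmetic operators are all assumptions made for this note, and expressions containing `=` (such as 'w1=wm') are deliberately not handled.

```python
import ast
import operator

# Arithmetic operators permitted in this toy expression language (an assumption of the sketch).
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_axis(expression, variables):
    """Evaluate an arithmetic axis expression point by point.

    `variables` maps each natural name to a list of floats, all of equal length.
    """
    tree = ast.parse(expression, mode="eval")
    length = len(next(iter(variables.values())))

    def visit(node, i):
        if isinstance(node, ast.Expression):
            return visit(node.body, i)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left, i), visit(node.right, i))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand, i)
        if isinstance(node, ast.Name):
            return variables[node.id][i]  # look up the variable by natural name
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported syntax in axis expression: {expression!r}")

    return [visit(tree, i) for i in range(length)]


variables = {"w1": [1.0, 2.0, 3.0], "w2": [10.0, 10.0, 10.0], "wm": [5.0, 5.0, 5.0]}
print(evaluate_axis("w1+w2", variables))
print(evaluate_axis("2*w1", variables))
print(evaluate_axis("wm-w1+w2", variables))
```

The point of the sketch is that an axis holds only the expression and looks the arrays up by name, which matches the chapter's statement that an axis refers to its variables rather than containing them.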
 dissertation.cls       |  13 ++-
 dissertation.pdf       | Bin 1007328 -> 1067876 bytes
 outline.org            |  10 +++
 processing/chapter.tex | 223 +++++++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 225 insertions(+), 21 deletions(-)
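
Note on the from-functions paragraph added to processing/chapter.tex: the text says that supporting a new data source only requires adding one function. A minimal sketch of one way such extensibility could be arranged is given here. It is a hypothetical registry pattern, not how WrightTools is organized; the names `FROM_FUNCTIONS`, `from_function`, `from_two_column`, and `create`, and the two-column text format, are invented for this note.

```python
# Hypothetical registry: maps a format name to the function that reads it.
FROM_FUNCTIONS = {}


def from_function(kind):
    """Decorator that registers a reader under the format name `kind`."""
    def decorator(function):
        FROM_FUNCTIONS[kind] = function
        return function
    return decorator


@from_function("two_column")
def from_two_column(text):
    """Read whitespace-separated 'coordinate signal' rows into one variable and one channel."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    return {
        "variables": {"w1": [float(row[0]) for row in rows]},
        "channels": {"signal": [float(row[1]) for row in rows]},
    }


def create(kind, text):
    """Dispatch to whichever from-function is registered for `kind`."""
    return FROM_FUNCTIONS[kind](text)


data = create("two_column", "1.0 0.5\n2.0 0.7\n")
print(data["variables"]["w1"], data["channels"]["signal"])
```

Supporting another format in this sketch means writing one more decorated function; nothing downstream of `create` changes, which is the property the chapter attributes to from-functions.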

diff --git a/dissertation.cls b/dissertation.cls
index c22e3c8..2c6d168 100644
--- a/dissertation.cls
+++ b/dissertation.cls
@@ -59,19 +59,24 @@
 \newenvironment{denumerate}
   {
   \begin{enumerate}
-  \singlespacing  
+  \singlespacing
   }
   {
   \end{enumerate}
   }
 
+\setlist[itemize, 1]{nosep}
+\setlist[itemize, 2]{nosep, topsep=-5ex}
+\setlist[itemize, 3]{nosep, topsep=-5ex}
+\setlist[itemize, 4]{nosep, topsep=-5ex}
 \newenvironment{ditemize}
   {
-  \begin{enumerate}
+  \begin{itemize}
+  \renewcommand{\labelitemi}{$\rightarrow$}
   \singlespacing  
   }
   {
-  \end{enumerate}
+  \end{itemize}
   }  
 
 % --- code environment ----------------------------------------------------------------------------
@@ -107,6 +112,8 @@
 \BeforeBeginEnvironment{codefragment}{\begin{singlespace}\stepcounter{equation}}
 \AfterEndEnvironment{codefragment}{\end{singlespace}}
 
+\newmintinline[python]{python}{bgcolor=bg}
+
 % --- tables --------------------------------------------------------------------------------------
 
 \newenvironment{dtable}
diff --git a/dissertation.pdf b/dissertation.pdf
index 791884c..75b916d 100644
Binary files a/dissertation.pdf and b/dissertation.pdf differ
diff --git a/outline.org b/outline.org
index e4990c4..0d66e26 100644
--- a/outline.org
+++ b/outline.org
@@ -6,6 +6,16 @@
 ** TODO cite:AubockGerald2012a
 * materials
 * software
+* processing
+** data object model
+*** creating a data object
+*** dimensionality manipulation
+*** math
+*** the wt5 file
+** artists
+** fitting
+** distribution and licensing
+** future directions
 * instrumental development
 * PbSe
 * MX2
diff --git a/processing/chapter.tex b/processing/chapter.tex
index 9c9ccab..0e0e5cb 100644
--- a/processing/chapter.tex
+++ b/processing/chapter.tex
@@ -17,7 +17,6 @@ WrightTools is a software package at the heart of all work in the Wright Group.
 
 % TODO: more intro
 
-
 WrightTools is written in Python, and endeavors to have a ``pythonic'', explicit and ``natural''
 application programming interface (API).  %
 To use WrightTools, simply import:
@@ -29,7 +28,7 @@ To use WrightTools, simply import:
 I'll discuss more about how exactly WrightTools packaging, distribution, and instillation works in
 \autoref{sec:processing_distbribution}.
 
-We can use the builtin Python function \mintinline{python}{dir} to interrogate the contents of the
+We can use the builtin Python function \python{dir} to interrogate the contents of the
 WrightTools package.  %
 \begin{codefragment}{python}
 >>> dir(wt)
@@ -59,53 +58,240 @@ WrightTools package.  %
  'kit',
  'open',
  'units']
-\end{codefragment}
+\end{codefragment}  % TODO: consider adding fit to this list
 Many of these are dunder (double underscore) attributes---Python internals that are not normally
 used directly.  %
 The ten attributes that do not start with underscore are the public API that users of WrightTools
 typically use.  %
-Within the public API are two classes, \mintinline{python}{Collection} \&
-\mintinline{python}{Data}, which are the two main classes in the WrightTools object model.  %
-\mintinline{python}{Data} stores spectra directly as multidimensional arrays, and
-\mintinline{python}{Collection} stores \textit{groups} of data objects (and other collection
+Within the public API are two classes, \python{Collection} \&
+\python{Data}, which are the two main classes in the WrightTools object model.  %
+\python{Data} stores spectra directly as multidimensional arrays, and
+\python{Collection} stores \textit{groups} of data objects (and other collection
 objects) in a hierarchical way for internal organization purposes.  %
 
 \section{Data object model}  % ====================================================================
 
 WrightTools uses a programming strategy called object oriented programming (OOP).  %
+% TODO: introduce HDF5
+% TODO: elaborate on the concept of OOP and how it relates to WrightTools
 
 It contains a central data ``container'' that is capable of storing all of the information about
-each multidimensional (or one-dimensional) spectra: the \mintinline{python}{Data} class.  %
-It also defines a \mintinline{python}{Collection} class that contains data objects, collection
+each multidimensional (or one-dimensional) spectrum: the \python{Data} class.  %
+It also defines a \python{Collection} class that contains data objects, collection
 objects, and other pieces of metadata in a hierarchical structure.  %
 Let's first discuss \mitinline{python}{Data}.
 
 All spectra are stored within WrightTools as multidimensional arrays.  %
 Arrays are containers that store many instances of the same data type, typically numerical
 datatypes.  %
-These arrays have some \mintinline{python}{shape}, \mintinline{python}{size}, and
-\mintinline{python}{size}.  %
+These arrays have some \python{shape}, \python{size}, and
+\python{dtype}.  %
 In the context of WrightTools, they can contain floats, integers, complex numbers and NaNs.  %
 
-The \mintinline{python}{Data} class contains everything that is needed to define a single spectra
+The \python{Data} class contains everything that is needed to define a single spectrum
 from a single experiment (or simulation).  %
 To do this, each data object contains several multidimensional arrays (typically 2 to 50 arrays,
 depending on the kind of data).  %
-There are two kinds of arrays, instances of \mintinline{python}{Variable} and
-\mintinline{python}{Channel}.  %
+There are two kinds of arrays, instances of \python{Variable} and \python{Channel}.  %
 Variables are coordinate arrays that define the position of each pixel in the multidimensional
 spectrum, and channels are each a particular kind of signal within that spectrum.  %
-Typical variables might be \mintinline{python}{[w1, w2, w3, d1, d2]}, and typical channels
-\mintinline{python}{[pmt, pyro1, pyro2, pyro3]}.  %
+Typical variables might be \python{[w1, w2, w3, d1, d2]}, and typical channels
+\python{[pmt, pyro1, pyro2, pyro3]}.  %
+
+As an overview, the following list gives the attributes and methods of \python{Data} in
+lexicographic order.  %
+\begin{ditemize}
+  \item method \python{collapse}: Collapse along one dimension in a well-defined way.
+  \item method \python{convert}: Convert all axes of a certain kind.
+  \item method \python{create_channel}: Create a new channel.
+  \item method \python{create_variable}: Create a new variable.
+  \item method \python{fullpath}
+  \item method \python{get_nadir}
+  \item method \python{get_zenith}
+  \item method \python{heal}
+  \item attribute \python{kind}
+  \item method \python{level}
+  \item method \python{map_variable}
+  \item attribute \python{natural_name}
+  \item attribute \python{ndim}
+  \item method \python{offset}
+  \item method \python{print_tree}
+  \item method \python{remove_channel}
+  \item method \python{remove_variable}
+  \item method \python{rename_channels}
+  \item method \python{rename_variables}
+  \item attribute \python{shape}
+  \item method \python{share_nans}
+  \item attribute \python{size}
+  \item method \python{smooth}
+  \item attribute \python{source}
+  \item method \python{split}
+  \item method \python{transform}
+  \item attribute \python{units}
+  \item attribute \python{variable_names}
+  \item attribute \python{variables}
+  \item method \python{zoom}
+\end{ditemize}
+
+Each data object contains instances of \python{Channel} and \python{Variable} which represent the
+principal multidimensional arrays.  %
+The following lexicographically lists the attributes and methods of these instances.  %
+Certain methods and attributes are unique to only one type of dataset, and are marked as such.  %
+\begin{ditemize}
+  \item method \python{argmax}
+  \item method \python{argmin}
+  \item method \python{chunkwise}
+  \item method \python{clip}
+  \item method \python{convert}
+  \item attribute \python{full}
+  \item attribute \python{fullpath}
+  \item attribute \python{label} (variable only)
+  \item method \python{log}
+  \item method \python{log10}
+  \item method \python{log2}
+  \item method \python{mag}
+  \item attribute \python{major_extent} (channel only)
+  \item method \python{max}
+  \item method \python{min}
+  \item attribute \python{minor_extent} (channel only)
+  \item attribute \python{natural_name}
+  \item method \python{normalize} (channel only)
+  \item attribute \python{null} (channel only)
+  \item attribute \python{parent}
+  \item attribute \python{points}
+  \item attribute \python{signed} (channel only)
+  \item method \python{slices}
+  \item method \python{symmetric_root}
+  \item method \python{trim} (channel only)
+\end{ditemize}
+Channels and variables also support direct indexing / slicing using \python{__getitem__}, as
+discussed more in...  % TODO: where is it discussed more?
+ 
+Axes are ways to organize data as a function of particular variables (and combinations thereof).  %
+The \python{Axis} class does not directly contain the respective arrays---it refers to the
+associated variables.  %
+The flexibility of this association is one of the main new features in WrightTools 3.  %
+Axis expressions are simple human-friendly strings made up of numbers and variable
+\python{natural_name}s.  %
+Given 5 variables with names \python{['w1', 'w2', 'wm', 'd1', 'd2']}, example valid expressions
+include \python{'w1'}, \python{'w1=wm'}, \python{'w1+w2'}, \python{'2*w1'}, \python{'d1-d2'}, and
+\python{'wm-w1+w2'}.  %
+Axes can be directly indexed / sliced into using \python{__getitem__}, and they support many of the
+``numpy-like'' attributes.  %
+A lexicographical list of axis attributes and methods follows.
+\begin{ditemize}
+  \item attribute \python{full}
+  \item attribute \python{label}
+  \item attribute \python{natural_name}
+  \item attribute \python{ndim}
+  \item attribute \python{points}
+  \item attribute \python{shape}
+  \item attribute \python{size}
+  \item attribute \python{units_kind}
+  \item attribute \python{variables}
+  \item method \python{convert}
+  \item method \python{min}
+  \item method \python{max}
+\end{ditemize}  % TODO: actually lexicographical
+
+\subsection{Creating a data object}  % ------------------------------------------------------------
+
+WrightTools data objects are capable of storing arbitrary multidimensional spectra, but how can we
+actually get data into WrightTools?  %
+If you start with a wt5 file, the answer is easy: \python{wt.open(<filepath>)}.  %
+But what if you have data that was written using some other software?  %
+WrightTools offers data conversion functions (``from'' functions) that do the hard work of creating
+data objects from other files.  %
+These from-functions are as parameter free as possible, which means they recognize details like
+shape and units from each specific file format without manual user intervention.  %
+
+The most important thing about from-functions is that they are extensible: more from-functions
+can be easily added as needed.  %
+This modular approach to data creation means that individuals who want to use WrightTools for new
+data sources can simply add one function to unlock the capabilities of the entire package as
+applied to their data.  %
+
+Following are the current from-functions, and the types of data that they support.
+\begin{ditemize}
+  \item Cary (collection creation)
+  \item COLORS
+  \item KENT
+  \item PyCMDS
+  \item Ocean Optics
+  \item Shimadzu
+  \item Tensor27
+\end{ditemize}  % TODO: complete list, update wright.tools to be consistent
+  
+\subsubsection{Discover dimensions}
+
+Certain older Wright Group file types (COLORS and KENT) are particularly difficult to import using
+a parameter-free from-function.  %
+There are two problems:
+\begin{ditemize}
+  \item Each individual file is limited in dimensionality (1D for KENT, 2D for COLORS).
+  \item Lack of self-describing metadata.
+\end{ditemize}
+The way that WrightTools handles data creation for these file-types deserves special discussion.  %
+
+Firstly, WrightTools contains hardcoded column information for each filetype...
+For COLORS...  % TODO
+
+Secondly, WrightTools accepts a list of files which it stacks together to form a single large
+array.  %
+
+Finally, the \python{wt.kit.discover_dimensions} function is called.  %
+This function does its best to recognize the parameters of the original scan...  % TODO
+
+\subsubsection{From directory}
+
+% TODO (also document on wright.tools)
+
+\subsection{Math}  % ------------------------------------------------------------------------------
+
+Now that we know the basics of how the WrightTools \python{Data} class stores data, it's time to do
+some data manipulation.  %
+Let's start with some elementary algebra.  %
+
+\subsubsection{In place operators}
+
+Operators are...  % TODO
+Because the \python{Data} object is mostly stored outside of memory, it is better to do
+in-place... % TODO
+
+Broadcasting... % TODO
+
+\subsubsection{Clip}
+
+% TODO
+
+\subsubsection{Symmetric root}
+
+% TODO
+
+\subsubsection{Log}
+
+% TODO
 
 \subsection{Dimensionality manipulation}  % -------------------------------------------------------
 
+WrightTools offers several strategies for reducing the dimensionality of a data object.  %
+Also consider using the fit sub-package.  % TODO: more info, link to section
+
 \subsubsection{Chop}
 
+Chop is one of the most important methods of \python{Data}, although it is typically not called directly by
+users of WrightTools.  %
+
 \subsubsection{Collapse}
 
 \subsubsection{Split}
 
+\subsubsection{Join}
+
+\subsection{The wt5 file format}  % ---------------------------------------------------------------
+
+Since WrightTools is based on the HDF5 file format...  % TODO
+
 \section{Artists}  % ==============================================================================
 
 After importing and manipulating data, one typically wants to create a plot.  %
@@ -134,9 +320,10 @@ In the future, other libraries (e.g. mayavi), may be incorporated.  %
 
 \section{Fitting}  % ==============================================================================
 
-
-
 \section{Distribution and licensing} \label{sec:processing_disbribution}  % =======================
 
+WrightTools is MIT licensed.  %
+
+WrightTools is distributed on PyPI and conda-forge.
 
 \section{Future directions}  % ====================================================================
\ No newline at end of file
-- 
cgit v1.2.3