author Blaise Thompson <blaise@untzag.com> 2018-03-19 17:11:06 -0500
committer Blaise Thompson <blaise@untzag.com> 2018-03-19 17:11:06 -0500
commit ac6bde61f90e6684b5b5b79286ebe58d08c09f9c
tree 36526be205a9c0149347d0faa1b2feec309a4dd8 /processing
parent f5b439d81f44dad0345031d1d72a783b4b429ef9
2018-03-19 17:11
Diffstat (limited to 'processing')
-rw-r--r-- processing/chapter.tex 223
1 files changed, 205 insertions, 18 deletions
diff --git a/processing/chapter.tex b/processing/chapter.tex
index 9c9ccab..0e0e5cb 100644
--- a/processing/chapter.tex
+++ b/processing/chapter.tex
@@ -17,7 +17,6 @@ WrightTools is a software package at the heart of all work in the Wright Group.
% TODO: more intro
-
WrightTools is written in Python, and endeavors to have a ``pythonic'', explicit and ``natural''
application programming interface (API). %
To use WrightTools, simply import:
@@ -29,7 +28,7 @@ To use WrightTools, simply import:
I'll discuss more about how exactly WrightTools packaging, distribution, and installation work in
\autoref{sec:processing_disbribution}.
-We can use the builtin Python function \mintinline{python}{dir} to interrogate the contents of the
+We can use the builtin Python function \python{dir} to interrogate the contents of the
WrightTools package. %
\begin{codefragment}{python}
>>> dir(wt)
@@ -59,53 +58,240 @@ WrightTools package. %
'kit',
'open',
'units']
-\end{codefragment}
+\end{codefragment} % TODO: consider adding fit to this list
Many of these are dunder (double underscore) attributes---Python internals that are not normally
used directly. %
The ten attributes that do not start with underscore are the public API that users of WrightTools
typically use. %
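The same filtering can be done programmatically. %
The following is a minimal sketch, not part of WrightTools: it uses the standard-library \python{json} module as a stand-in so that it runs without WrightTools installed, and the helper name \python{public_names} is hypothetical. %

```python
import json


def public_names(module):
    """Return the names in dir(module) that do not start with an underscore."""
    return [name for name in dir(module) if not name.startswith("_")]


# json is only a stand-in here; the same call should work on any imported module
names = public_names(json)
print(names)
```

Names such as \python{__name__} are filtered out, leaving only the public names. %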
-Within the public API are two classes, \mintinline{python}{Collection} \&
-\mintinline{python}{Data}, which are the two main classes in the WrightTools object model. %
-\mintinline{python}{Data} stores spectra directly as multidimensional arrays, and
-\mintinline{python}{Collection} stores \textit{groups} of data objects (and other collection
+Within the public API are two classes, \python{Collection} \&
+\python{Data}, which are the two main classes in the WrightTools object model. %
+\python{Data} stores spectra directly as multidimensional arrays, and
+\python{Collection} stores \textit{groups} of data objects (and other collection
objects) in a hierarchical way for internal organization purposes. %
\section{Data object model} % ====================================================================
WrightTools uses a programming strategy called object-oriented programming (OOP). %
+% TODO: introduce HDF5
+% TODO: elaborate on the concept of OOP and how it relates to WrightTools
It contains a central data ``container'' that is capable of storing all of the information about
-each multidimensional (or one-dimensional) spectra: the \mintinline{python}{Data} class. %
-It also defines a \mintinline{python}{Collection} class that contains data objects, collection
+each multidimensional (or one-dimensional) spectrum: the \python{Data} class. %
+It also defines a \python{Collection} class that contains data objects, collection
objects, and other pieces of metadata in a hierarchical structure. %
Let's first discuss \python{Data}.
All spectra are stored within WrightTools as multidimensional arrays. %
Arrays are containers that store many instances of the same data type, typically numerical
datatypes. %
-These arrays have some \mintinline{python}{shape}, \mintinline{python}{size}, and
-\mintinline{python}{size}. %
+These arrays have some \python{shape}, \python{size}, and
+\python{dtype}. %
In the context of WrightTools, they can contain floats, integers, complex numbers and NaNs. %
-The \mintinline{python}{Data} class contains everything that is needed to define a single spectra
+The \python{Data} class contains everything that is needed to define a single spectrum
from a single experiment (or simulation). %
To do this, each data object contains several multidimensional arrays (typically 2 to 50 arrays,
depending on the kind of data). %
-There are two kinds of arrays, instances of \mintinline{python}{Variable} and
-\mintinline{python}{Channel}. %
+There are two kinds of arrays, instances of \python{Variable} and \python{Channel}. %
Variables are coordinate arrays that define the position of each pixel in the multidimensional
spectrum, and channels are each a particular kind of signal within that spectrum. %
-Typical variables might be \mintinline{python}{[w1, w2, w3, d1, d2]}, and typical channels
-\mintinline{python}{[pmt, pyro1, pyro2, pyro3]}. %
+Typical variables might be \python{[w1, w2, w3, d1, d2]}, and typical channels
+\python{[pmt, pyro1, pyro2, pyro3]}. %
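The relationship between a data object, its variables, and its channels can be sketched in plain Python. %
This is an illustrative toy model only, not the WrightTools implementation: the \python{Array} class, the method signatures, and the shapes and units below are all assumptions made for the example. %

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Array:
    """Hypothetical stand-in for one named multidimensional array."""

    natural_name: str
    shape: tuple
    units: str = None

    @property
    def size(self):
        # total number of elements in the array
        return prod(self.shape)

    @property
    def ndim(self):
        return len(self.shape)


@dataclass
class Data:
    """Toy container: coordinate arrays (variables) and signal arrays (channels)."""

    variables: dict = field(default_factory=dict)
    channels: dict = field(default_factory=dict)

    def create_variable(self, name, shape, units=None):
        self.variables[name] = Array(name, shape, units)
        return self.variables[name]

    def create_channel(self, name, shape, units=None):
        self.channels[name] = Array(name, shape, units)
        return self.channels[name]


# a made-up two-dimensional scan of w1 against d1
data = Data()
data.create_variable("w1", (51, 21), units="wn")
data.create_variable("d1", (51, 21), units="fs")
data.create_channel("pmt", (51, 21), units="V")
print(sorted(data.variables), sorted(data.channels))
print(data.channels["pmt"].ndim, data.channels["pmt"].size)
```

In this sketch every array shares the full shape of the scan, so each pixel of a channel has a coordinate in every variable. %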
+
+As an overview, the attributes and methods of \python{Data} are listed below in lexicographic
+order. %
+\begin{ditemize}
+ \item method \python{collapse}: Collapse along one dimension in a well-defined way.
+ \item method \python{convert}: Convert all axes of a certain kind.
+ \item method \python{create_channel}: Create a new channel.
+ \item method \python{create_variable}: Create a new variable.
+ \item method \python{fullpath}
+ \item method \python{get_nadir}
+ \item method \python{get_zenith}
+ \item method \python{heal}
+ \item attribute \python{kind}
+ \item method \python{level}
+ \item method \python{map_variable}
+ \item attribute \python{natural_name}
+ \item attribute \python{ndim}
+ \item method \python{offset}
+ \item method \python{print_tree}
+ \item method \python{remove_channel}
+ \item method \python{remove_variable}
+ \item method \python{rename_channels}
+ \item method \python{rename_variables}
+ \item attribute \python{shape}
+ \item method \python{share_nans}
+ \item attribute \python{size}
+ \item method \python{smooth}
+ \item attribute \python{source}
+ \item method \python{split}
+ \item method \python{transform}
+ \item attribute \python{units}
+ \item attribute \python{variable_names}
+ \item attribute \python{variables}
+ \item method \python{zoom}
+\end{ditemize}
+
+Each data object contains instances of \python{Channel} and \python{Variable} which represent the
+principal multidimensional arrays. %
+The attributes and methods of these instances are listed below in lexicographic order. %
+Certain methods and attributes are unique to one type of dataset, and are marked as such. %
+\begin{ditemize}
+ \item method \python{argmax}
+ \item method \python{argmin}
+ \item method \python{chunkwise}
+ \item method \python{clip}
+ \item method \python{convert}
+ \item attribute \python{full}
+ \item attribute \python{fullpath}
+ \item attribute \python{label} (variable only)
+ \item method \python{log}
+ \item method \python{log10}
+ \item method \python{log2}
+ \item method \python{mag}
+ \item attribute \python{major_extent} (channel only)
+ \item method \python{max}
+ \item method \python{min}
+ \item attribute \python{minor_extent} (channel only)
+ \item attribute \python{natural_name}
+ \item method \python{normalize} (channel only)
+ \item attribute \python{null} (channel only)
+ \item attribute \python{parent}
+ \item attribute \python{points}
+ \item attribute \python{signed} (channel only)
+ \item method \python{slices}
+ \item method \python{symmetric_root}
+ \item method \python{trim} (channel only)
+\end{ditemize}
+Channels and variables also support direct indexing / slicing using \python{__getitem__}, as
+discussed more in... % TODO: where is it discussed more?
+
+Axes are ways to organize data as a function of particular variables (and combinations thereof). %
+The \python{Axis} class does not directly contain the respective arrays---it refers to the
+associated variables. %
+The flexibility of this association is one of the main new features in WrightTools 3. %
+Axis expressions are simple human-friendly strings made up of numbers and variable
+\python{natural_name}s. %
+Given 5 variables with names \python{['w1', 'w2', 'wm', 'd1', 'd2']}, example valid expressions
+include \python{'w1'}, \python{'w1=wm'}, \python{'w1+w2'}, \python{'2*w1'}, \python{'d1-d2'}, and
+\python{'wm-w1+w2'}. %
+Axes can be directly indexed / sliced into using \python{__getitem__}, and they support many of the
+``numpy-like'' attributes. %
+A lexicographical list of axis attributes and methods follows.
+\begin{ditemize}
+ \item attribute \python{full}
+ \item attribute \python{label}
+ \item attribute \python{natural_name}
+ \item attribute \python{ndim}
+ \item attribute \python{points}
+ \item attribute \python{shape}
+ \item attribute \python{size}
+ \item attribute \python{units_kind}
+ \item attribute \python{variables}
+ \item method \python{convert}
+ \item method \python{min}
+ \item method \python{max}
+\end{ditemize} % TODO: actually lexicographical
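To make the idea of an axis expression concrete, the following sketch evaluates arithmetic expressions of variable names at a single point. %
It is an illustration only and is not how WrightTools evaluates expressions: the function \python{evaluate_expression} and the numerical values are hypothetical, and the \python{'w1=wm'} form is deliberately not handled here. %

```python
import ast
import operator

# binary operators permitted inside an expression
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_expression(expression, values):
    """Evaluate an arithmetic expression of variable names and numbers.

    values maps each variable name to a number.
    """

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        if isinstance(node, ast.Name):
            return values[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported syntax in {expression!r}")

    return visit(ast.parse(expression, mode="eval"))


# made-up values for the five variables at one pixel
point = {"w1": 1500.0, "w2": 1600.0, "wm": 1700.0, "d1": 0.5, "d2": -0.25}
for expression in ["w1", "w1+w2", "2*w1", "d1-d2", "wm-w1+w2"]:
    print(expression, "->", evaluate_expression(expression, point))
```

Walking the syntax tree, rather than calling \python{eval}, restricts expressions to numbers, known variable names, and basic arithmetic. %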
+
+\subsection{Creating a data object} % ------------------------------------------------------------
+
+WrightTools data objects are capable of storing arbitrary multidimensional spectra, but how can we
+actually get data into WrightTools? %
+If you start with a wt5 file, the answer is easy: \python{wt.open(<filepath>)}. %
+But what if you have data that was written using some other software? %
+WrightTools offers data conversion functions (``from'' functions) that do the hard work of creating
+data objects from other files. %
+These from-functions are as parameter-free as possible, which means they recognize details like
+shape and units from each specific file format without manual user intervention. %
+
+The most important thing about from-functions is that they are extensible: that is, more
+from-functions can easily be added as needed. %
+This modular approach to data creation means that individuals who want to use WrightTools for new
+data sources can simply add one function to unlock the capabilities of the entire package as
+applied to their data. %
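The modular pattern described above can be sketched as a registry of parser functions. %
Everything in this example is hypothetical, including the registry, the \python{register} decorator, and the made-up two-column text format; it does not reproduce any actual WrightTools from-function. %

```python
import io

# hypothetical registry mapping a format name to its from-function
FROM_FUNCTIONS = {}


def register(name):
    """Decorator that adds a from-function to the registry under name."""

    def decorator(function):
        FROM_FUNCTIONS[name] = function
        return function

    return decorator


@register("two_column")
def from_two_column(stream):
    """Parse a made-up format: a 'name(units)' header line, then numeric rows."""
    header = stream.readline().split()
    names = [h.split("(")[0] for h in header]
    units = [h.split("(")[1].rstrip(")") for h in header]
    columns = {name: [] for name in names}
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        for name, token in zip(names, line.split()):
            columns[name].append(float(token))
    # names and units are read from the file itself, not supplied by the caller
    return {"units": dict(zip(names, units)), "columns": columns}


text = "w1(wn) signal(V)\n1500 0.1\n1510 0.4\n1520 0.2\n"
data = FROM_FUNCTIONS["two_column"](io.StringIO(text))
print(data["units"])
print(data["columns"]["signal"])
```

Supporting a new data source in this sketch means writing and registering one more function; nothing downstream needs to change. %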
+
+Following are the current from-functions, and the types of data that they support.
+\begin{ditemize}
+ \item Cary (collection creation)
+ \item COLORS
+ \item KENT
+ \item PyCMDS
+ \item Ocean Optics
+ \item Shimadzu
+ \item Tensor27
+\end{ditemize} % TODO: complete list, update wright.tools to be consistent
+
+\subsubsection{Discover dimensions}
+
+Certain older Wright Group file types (COLORS and KENT) are particularly difficult to import using
+a parameter-free from-function. %
+There are two problems:
+\begin{ditemize}
+ \item Each individual file is limited in dimensionality (1D for KENT, 2D for COLORS).
+ \item Lack of self-describing metadata.
+\end{ditemize}
+The way that WrightTools handles data creation for these file-types deserves special discussion. %
+
+Firstly, WrightTools contains hardcoded column information for each filetype...
+For COLORS... % TODO
+
+Secondly, WrightTools accepts a list of files which it stacks together to form a single large
+array. %
+
+Finally, the \python{wt.kit.discover_dimensions} function is called. %
+This function does its best to recognize the parameters of the original scan... % TODO
+
+\subsubsection{From directory}
+
+% TODO (also document on wright.tools)
+
+\subsection{Math} % ------------------------------------------------------------------------------
+
+Now that we know the basics of how the WrightTools \python{Data} class stores data, it's time to do
+some data manipulation. %
+Let's start with some elementary algebra. %
+
+\subsubsection{In place operators}
+
+Operators are... % TODO
+Because the \python{Data} object is mostly stored outside of memory, it is better to do
+in-place... % TODO
+
+Broadcasting... % TODO
+
+\subsubsection{Clip}
+
+% TODO
+
+\subsubsection{Symmetric root}
+
+% TODO
+
+\subsubsection{Log}
+
+% TODO
\subsection{Dimensionality manipulation} % -------------------------------------------------------
+WrightTools offers several strategies for reducing the dimensionality of a data object. %
+Also consider using the fit sub-package. % TODO: more info, link to section
+
\subsubsection{Chop}
+Chop is one of the most important methods of \python{Data}, although it is typically not called
+directly by users of WrightTools. %
+
\subsubsection{Collapse}
\subsubsection{Split}
+\subsubsection{Join}
+
+\subsection{The wt5 file format} % ---------------------------------------------------------------
+
+Since WrightTools is based on the HDF5 file format... % TODO
+
\section{Artists} % ==============================================================================
After importing and manipulating data, one typically wants to create a plot. %
@@ -134,9 +320,10 @@ In the future, other libraries (e.g. mayavi), may be incorporated. %
\section{Fitting} % ==============================================================================
-
-
\section{Distribution and licensing} \label{sec:processing_disbribution} % =======================
+WrightTools is MIT licensed. %
+
+WrightTools is distributed on PyPI and conda-forge.
\section{Future directions} % ====================================================================
\ No newline at end of file